INTRODUCTION TO DATA PROCESSING*



Introductory Remarks 

Edward M. Heiliger 

Director of Library and Information Retrieval Services 

Florida Atlantic University 
Boca Raton, Florida 

This meeting is the result of a cooperative effort by the Reference 
Services Division and the Resources and Technical Services Division of 
ALA. A committee of members of the two divisions planned this Pre- 
Conference Institute. These members were: Melvin J. Voigt, Henry J. 
Dubester, Donald V. Black, Robert Thomas, Jesse H. Shera, Maurice F. 
Tauber, Robert E. Kingery, and Edward Heiliger. Donald Wright and 
Elizabeth Rodell of the ALA staff, Ralph H. Parker and Frederick L. 
Arnold, Jr., also assisted the Committee. The guidelines laid down
for the Committee stated that: (1) the Institute should be for general 
librarians, not specialists; (2) the Institute should be an introduction 
to data processing; (3) the program should be aimed at librarians from 
all types of libraries; (4) there should be some talk about how the new 
hardware is being used; and (5) education for this new approach should 
be discussed. Except for the last point, these guidelines have been followed. The
Library Education Division is devoting a session to this subject at its 
meeting in St. Louis on Wednesday morning. The speakers for our In- 
stitute understand that their remarks must be directed to those of you 
who are coming without benefit of acquaintance with this new field. I
hope they can overcome their knowledgeability and get through to you
in a meaningful way. The speakers, with the exception of the two on 
Saturday, are all professional librarians. This, in itself, is some indication 
that the library profession is awakening to the possibilities of the new 
machinery. Some of these librarians have had many years of experience 
in this new area; others have had less, but all have become involved. All 
have something to say to you from their experience. 

*Editor's note: The ten papers which follow are revised versions of those pre- 
sented at the ALA Pre-Conference Institute held at the University of Missouri, Co- 
lumbia, June 24-27, 1964. The Institute was jointly sponsored by the Reference Services 
and the Resources and Technical Services divisions of ALA in cooperation with the 
University of Missouri. One talk is missing: Joseph Becker outlined in an informal 
manner the potential possibilities of the use of automation in libraries. He stressed 
the fact that machines are becoming more and more flexible and adaptable at the 
same time that they are shrinking in size, the introduction of transistors effecting the 
most striking improvements. What he said is generally covered by the book,
Information Storage and Retrieval, by him and Robert Hayes (Wiley 1963, $11.95).

Mr. Heiliger acted as Chairman of the Planning Committee and also of the Con- 
ference; he has also assisted in editing the papers. 
The Hardware of Data Processing



C. D. Gull 
Professor of Library Science 
Indiana University, Bloomington 

MR. BECKER HAS TOLD YOU that the first thing to start with is 
the punched card, historically, and perhaps in your operation. 
Here is a plain punched card (Illustration 1); the little rectangular holes 
or the circular holes have been punched in with a very accurate punch. 
Mr. Becker has already spoken to you of the binary system. This card is 
a representation of the binary system because at any one of those
coordinate positions, the intersection of a horizontal and a vertical axis,
a punched hole has the equivalent of a "yes" value in the binary system, 
and the absence of a hole has the equivalent of a "no" value in that 
system. These coordinates permit you to place against each of the punch- 
ing positions a variety of values. In this particular slide you will notice 
that one of the horizontal rows is labeled "six," for example, and the 
vertical columns are numbered at the bottom from one to eighty. Each 
punching position can be identified in more than one way, and you can 
build this variety up into letters and symbols as well as numerals. There 
is no text across the top of that card. 

By putting a card through one of the devices known as the Interpreter 
(Illustration 2) it is possible to read the holes and to print across the top 
of the card the text which is punched into the card. The cards are fed 
up into the top and come out in the lower pocket, and the cards are then 
humanly readable. The text on an interpreted card is printed across 
the top to correspond with punches for the letters of the alphabet, the 
numerals, and the punctuation marks in the body of the card. The 
registry between a punched character and its printed counterpart is not
exact, because in the earlier days, when interpreters were first put out,
they weren't able to print closely enough to have a one-to-one relationship
between columns and characters. This primitive stage has been overcome
in the 026 printing punch which punches and prints simultaneously. 
(Illustration 3) The young lady has alphabetic and numeric keyboards 
under her right hand as well as a number of control switches. She is 
feeding the cards in from the upper right, they pass across in front of 
her to the left, and they are stacked in the upper left position. The ma- 
chine has the capability of duplicating columns from a previous card, 

(Editor's note: Mr. Gull spoke informally, explaining the 46 slides which he
showed and which pictured the equipment and methods of using it. It was with great 
reluctance that he permitted us to publish the transcription of his remarks since he 
feels them to be inadequate without the illustrations which we, because of cost and 
space limitations, could not publish in full. The Editors, however, consider his com- 
ments a useful pulling together of information. The illustrations are available in the 
publications listed at the end of the discussion.) 
or, with the aid of a program control card in the center of the machine,
can tabulate by columns, duplicate cards, or leave cards unduplicated. 
The 026 is a much more sophisticated device than a simple typewriter, 
with much greater flexibility. The 026 printing punch is an input device. 
It is the station at which your written text, cataloging, order slips, etc., 
are transformed into binary codes in the form of holes in cards. 

The Flexowriter made by Friden is another input device, a typewriter 
which produces a paper tape with the punched holes in it. Paper tape in 
one sense is merely a number of cards strung together, or, if you prefer, 
cards are simply paper tape cut apart. The text, however, is not visible on 
the paper tape, and consequently you do have a problem in reading and 
correcting it which you do not encounter with punched cards. Paper tape 
has certain advantages, and you will want to get your engineers to advise 
you which of these two input devices you will want to choose in your 
particular application. Paper tape is the input chosen, for example, for 
Index Medicus and the MEDLARS Project at the NLM. 

The 026 printing punch prints the characters in the same column as 
the corresponding holes. Each character is made up of a number of little 
dots which are imprinted on top of the card as it goes through the 
printing punch. The punched and printed values are the same for 
numerals, because you chose that value for the row. But when you want 
to make the alphabet out of this card, you find that the original card has 
only the capacity for the ten decimal numerals. Two additional horizon- 
tal rows, sometimes called the eleven and the twelve punches, were added. 
By grouping the twelve punch with one to nine, you can build up A 
through I in the alphabet; grouped, you see, A through I with the twelve 
punch and a sequence of numbers. There are some other combinations 
which provide punctuation symbols. The manufacturer has simply taken 
the various positions on the card and built them up to form his standard 
alphabet. Other manufacturers use different standard combinations, and 
mathematicians and others have a lot of fun making many codes out of 
the various combinations. 
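
The zone-and-digit grouping just described can be sketched in a few lines of Python. This is only an illustration, not the manufacturer's specification: the name `punches_for` is hypothetical, the rule for A through I comes from the remarks above, and the rules for J through R (eleven punch with one to nine) and S through Z (zero punch with two to nine) are assumed from the commonly documented IBM card code. Punctuation symbols are left out.

```python
import string

def punches_for(ch):
    """Return the set of rows punched in one card column for ch.

    Rows are labeled 12, 11, 0, 1, ..., 9 (hypothetical representation).
    """
    if ch == " ":
        return set()                      # blank column: no holes at all
    if ch in string.digits:
        return {int(ch)}                  # a numeral is a single punch
    if ch not in string.ascii_uppercase:
        raise ValueError(f"no punch combination defined here for {ch!r}")
    index = string.ascii_uppercase.index(ch)   # A=0 ... Z=25
    if index < 9:                         # A-I: twelve punch plus 1-9
        return {12, index + 1}
    if index < 18:                        # J-R: eleven punch plus 1-9 (assumed)
        return {11, index - 9 + 1}
    return {0, index - 18 + 2}            # S-Z: zero punch plus 2-9 (assumed)

if __name__ == "__main__":
    for ch in "AIJSZ6":
        print(ch, sorted(punches_for(ch)))
```

A column is thus a small set of "yes" positions, and the code is nothing more than an agreed table from characters to such sets.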

The normal feed of a punched card is to put the nine edge (the bottom 
edge) through a reading machine, thus reading one to eighty characters 
at a time and correspondingly feeding impulses out to various mech- 
anisms, particularly to a printing device. Feeding on the nine edge 
normally means that you want multiple character reading from each 
card. There are a few card readers which will feed from the left edge and 
proceed from column one to eighty. These are character by character 
reading or printing devices. 

The card movement is from the right to the left, and a metal contact 
roller below receives successive electric impulses. So long as the paper 
intervenes, the electrical impulse never reaches the brush; but when it 
reaches a hole, the impulse goes on through and can be carried, to all 
intents and purposes, wherever you wish within the equipment. The 
timing of the impulses is significant. They are intermittent, and they are 
available only as each of the reading positions passes by, so there are a 



Volume 9, Number 1, Winter 1965 



dozen impulses for the movement of one card. You will recall that the 
hole has the positive value of "yes" attached to it, and the absence of 
the hole, or the state of being insulated, has the "no" value. The reading 
of a card can be described as a logical operation. The hole can be called 
"matching," and the insulated state is "not matching," or "rejection." 

The next piece of equipment is a sorter. A deck of cards is fed into 
the right-hand side of the machine, and the cards move toward the left 
and drop into pockets according to the punches which are read in a single 
column on a single pass (or sort) of the deck of cards. The blades direct 
the cards into the pockets according to the timed impulses as they pass 
through particular holes. One of the pockets is the reject pocket because 
there may be no hole in a particular column. There are twelve pockets, 
all of which are matched to some different positive value, and the 
thirteenth pocket for the reject situation. Since you can change the 
columns across the card and read different ones in succession and since 
you can specify that you may wish to read only certain rows within those 
columns, you have considerable flexibility in sorting operations. The 
physical arrangement of this equipment requires the operator to feed 
cards into one feed and to withdraw the cards from thirteen pockets, 
then repeat the process many times until the required order is achieved. 
The operator controls the order of the return of the cards to the feeding 
station. Ordinarily we are trying to arrange a deck of cards into a certain 
order. We may wish to have them in numerical order, in which case one
pass on each column will accomplish the work, the number of columns
being equal to the maximum number of digits in the numerical field. If we
have to sort alphabetically, we have to pass the cards through twice on
each column.
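
The column-by-column procedure can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular sorter: cards are represented as strings, only ten digit pockets and a reject pocket are modeled (the machine described above has twelve pockets and a reject), and the names `sort_pass` and `sort_numeric` are hypothetical. The operator's part, gathering the pockets in the right order before the next pass, appears as the restacking step.

```python
def sort_pass(deck, column):
    """One pass of the deck on a single column.

    Each card drops into the pocket for the digit found in that column;
    anything else (for example a blank) goes to the reject pocket.
    """
    pockets = {digit: [] for digit in "0123456789"}
    reject = []
    for card in deck:
        ch = card[column] if column < len(card) else " "
        pockets.get(ch, reject).append(card)
    # The operator picks up the pockets in ascending order, rejects first.
    restacked = list(reject)
    for digit in "0123456789":
        restacked.extend(pockets[digit])
    return restacked

def sort_numeric(deck, first_column, width):
    """Sort on a numeric field: one pass per column, rightmost column first."""
    for column in range(first_column + width - 1, first_column - 1, -1):
        deck = sort_pass(deck, column)
    return deck

if __name__ == "__main__":
    print(sort_numeric(["042", "007", "310", "041"], 0, 3))
```

Starting with the rightmost column matters: each later pass keeps cards with equal digits in the order the earlier passes gave them. Alphabetic sorting, which needs a zone pass and a digit pass on each column, is not shown.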

The two main operations here which the operator has to observe are 
the "greater than" and the "less than" situations. Thus we add two more 
operations to our understanding. The operator must be careful of the 
direction in which he picks up which packets of cards to reintroduce 
them into the feed. The memory of how to do this procedure often rests
in the operator's mind, or he follows directions which another person has
given him or which he has worked out for himself to solve his particular
sorting problem.

Some sorters are much more sophisticated; they can count holes in 
many columns. They have multiple brushes and little visible counters 
from which you can take statistical information. They can also note 
similarities in numbers in different fields on the face of the same card, 
in the fixed field situation, or in the free field situation, and then direct 
certain cards into certain pockets. These units are often called statistical 
sorters, and they are sometimes employed in non-conventional informa- 
tion systems for the purpose of retrieving information, for asking ques- 
tions and producing answers in terms of the accession number or the 
call number of a document. 

Sorters are used to sequence cards to give a useful arrangement. They 
are used for the arrangement of cards prior to interfiling them into a 
catalog, for example. In the operations the sorter does the matching; 
that is, the sorter makes a decision, thus relieving the librarian of that
job. The operator, however, retains the decision of greater than or less 
than in the ordering process. 

After you have put cards into order, you are in a position to do 
something with them. We now consider the piece of equipment com- 
monly called the tabulator. There are a number of different models, and 
they are used for several purposes; but librarians are primarily interested 
in them as printing devices. One deck feeds into the machine at the top 
on the nine edge so that all eighty characters are read at once and then 
they are stacked up at the bottom. All or a portion of each card can be
printed at a time. The printing speeds are about 150 lines a minute. 
Until 1950 this speed was marvelous and we certainly enjoyed the sound 
of this machinery clanking along rhythmically because the sound meant 
that we were getting printed output, but today it is a very slow speed. 
Since these primarily are accounting machines, they will add and sub- 
tract numbers; furthermore they will give you minor, intermediate, and 
major totals which means that they will group certain types of informa- 
tion and then do additions afterward. 

There are other devices in the punch card line which will perform 
multiplication and division, as well, and produce new punched cards. 
These are various card calculating devices. You may take the new cards 
and put them back through tabulators for printing out, for example. 
The arithmetical capability of these printing machines is primarily of 
use to librarians in order department work, for personnel records, etc. 

Illustration 4 shows how tabulators are wired. As the card is being 
read, an impulse passes through the hole, over to an external area and 
through a control panel; then the impulse goes back into the tabulator 
to actuate a solenoid to strike a piece of type on an elevated typebar; 
the impression will go through the ribbon and put a character on a piece 
of paper. This slide shows why the machine clanked along, because when 
all eighty of these typebars are in proper position, the eighty solenoids 
for the hammers all hit at once, and a whole line is printed at a time. 

The plug board is a very interesting part of the equipment. It will 
permit the operator to break a circuit or to continue it, and may permit 
him to switch a particular impulse to a new location or perhaps into 
multiple locations. The basic principle was very difficult to unearth in 
the punched card literature of the 1930's. It is similar to travelling to 
the fork of a road, where you can choose either fork and proceed to 
another fork and repeat the process. The plugboard offers great flexi- 
bility in directing an impulse wherever you want it to go. 

The real disadvantage of tabulators for librarians is that they offer 
only a very limited set of characters. Some of them are as limited as 
thirty-nine characters in the set, that is twenty-six letters, ten numerals 
and three punctuation marks; others have sixty-four characters in a set. 
This limited quantity is in upper case characters only; generally, there 
are no lower case characters and very little variety in symbols. While we 
think of our alphabet of twenty-six letters as being a restricted one, yet 
capable of writing everything, in practice we use much larger alphabets.
The character set used by Chemical Abstracts contains nearly a thousand 
characters. This particular type of printing restriction has been very 
serious indeed, and is, I think, responsible for the lack of enthusiasm 
with which librarians view the possibility of printing with the tabulator. 

Illustration 5 shows the plugboard which goes with several of these 
devices. Each one of the little wires can be removed and placed some- 
where else. The operator can wire a bewildering variety of programs into 
a tabulator with a plugboard for routing the impulses. This characteristic 
of the plugboard must be understood clearly. Each new job means that 
the plugboard has to be rewired by hand and checked for accuracy. This 
rewiring is not required with computers, because computer programs are 
written and placed in the computer's memory to accomplish the control 
of the operations. 
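
One way to picture the plugboard is as a table that routes each card column to one or more print positions. The sketch below is only an analogy under that assumption; the name `print_line` and the dictionary layout are hypothetical and do not correspond to any real control panel. Changing the job means changing the table, which on the tabulator means moving wires by hand.

```python
def print_line(card, wiring, width=20):
    """Route characters from card columns to print positions.

    wiring maps a card column to the list of print positions that
    should receive the character read from that column.
    """
    line = [" "] * width
    for column, positions in wiring.items():
        for position in positions:        # one impulse may go to several places
            line[position] = card[column]
    return "".join(line)

if __name__ == "__main__":
    # Column 0 is printed at position 2; column 1 at positions 0 and 4.
    print(repr(print_line("AB", {0: [2], 1: [0, 4]}, width=5)))
```

In a computer the equivalent of this table is part of the stored program, so a new job needs a new program rather than a rewired board.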

A slightly more advanced device for printing is the print wheel. Instead
of having the type bars move up and down, the bar, now wrapped 
around in a circle, is a wheel which is rotated into position. The type 
is struck against the ribbon and paper by a small hammer. This arrange- 
ment affords a little greater speed and a little larger set of characters. 

One deck of cards is often not sufficient, so you need extra decks. 
The reproducer is the machine by which you accomplish this job. You 
feed the deck you want copied into one of the feeds and blank cards in 
the other, and you get out two identical decks. You can operate with 
both decks, put the second deck in a new order, for example. But this 
device doesn't print; it only reproduces. If you want to read what is on 
the card, you have to take the new deck to the interpreter before you can 
use them manually and visually. 

The most sophisticated device in the punched card line is the collator.
It feeds two decks simultaneously; usually these decks are in the same 
order, but there are some situations where you may ignore that require- 
ment. Instead of comparing only a column at a time, as the sorter does, 
these may compare from eight to sixteen columns of alphanumeric in- 
formation rather than just numeric data. In other words, the collator is 
a word-comparing machine, whereas the simple sorter is a character- 
comparing machine. We need to know what can be done with cards on 
a word-by-word basis. The collator permits us to select or reject cards 
by the matching operation; this is a form of retrieving information. The 
collator permits us to compare values by the greater than or less than 
operations, as well as by the equal or unequal comparison. It enables us
to accomplish a sequence check; we can put some cards into a collator and
find out if they are in the right order and check them for ascending or
descending order before using them further.

All of us have had to interfile cards; the ones we take off the sorter, 
for example, which were in alphabetic order, have to be interfiled some- 
where with another deck. The collator permits us to do our filing or 
merging, and this can be done on the basis of a common number or on 
the lack of a common number. The collator can also be used for
retrieval operations, by putting two decks in, each of which represents
a single subject heading, for example, and looking for common numbers 
in other fields of those decks. If a common number is found, the cor- 
responding cards are selected out and they are presumed to contain the 
answer to the retrieval question. This matching is the logical operation 
called "and," logical product, or logical conjunction. "And" is one of 
the new operations that you are to remember, then. 

If two decks are merged by the collator, in the absence of numbers 
or by ignoring the numbers, making one deck out of two, the operation 
is called logical "or," logical alternation or logical sum, or as some prefer 
to call it, disjunction. The important point about these operations is 
that they are controlled by the equipment on the card-by-card level. The 
operator makes the choice of logical operation at the group level, but 
the individual decisions are made by the machine on the individual 
punched cards, and you work only with the results. This description is 
equivalent to saying, "I will go to a subject heading in the dictionary 
catalog for retrieval purposes and pull out five hundred cards on that 
subject. Since I want to consider those cards which are related to a 
second subject, I have to look over all five hundred cards one by one; 
but with punched cards and a collator, I can turn that job of scanning 
over to the collator, and it will select all those cards which show the 
relationship between the two subjects." The collator has relieved the 
human of this particular decision problem. Now the operations AND 
and OR can also be accomplished with sorters, but humans control 
the operations with those devices. 
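
The three collator operations described above (sequence check, matching, and merging) can be sketched as follows. This is a minimal model, assuming each deck is a list of document numbers already in ascending order with no number repeated within a deck; the function names are hypothetical.

```python
def in_sequence(deck):
    """Sequence check: is the deck in ascending order?"""
    return all(a <= b for a, b in zip(deck, deck[1:]))

def match_and(deck_a, deck_b):
    """Logical AND: select the numbers that appear in both decks."""
    i = j = 0
    selected = []
    while i < len(deck_a) and j < len(deck_b):
        if deck_a[i] == deck_b[j]:        # match: select the card
            selected.append(deck_a[i])
            i += 1
            j += 1
        elif deck_a[i] < deck_b[j]:       # less than: advance deck A
            i += 1
        else:                             # greater than: advance deck B
            j += 1
    return selected

def merge_or(deck_a, deck_b):
    """Logical OR: interfile two decks into one, keeping every card."""
    i = j = 0
    merged = []
    while i < len(deck_a) and j < len(deck_b):
        if deck_a[i] <= deck_b[j]:
            merged.append(deck_a[i])
            i += 1
        else:
            merged.append(deck_b[j])
            j += 1
    merged.extend(deck_a[i:])             # whatever remains in either feed
    merged.extend(deck_b[j:])
    return merged

if __name__ == "__main__":
    heading_one = [101, 205, 317, 440]
    heading_two = [205, 300, 440, 512]
    print(match_and(heading_one, heading_two))
    print(merge_or(heading_one, heading_two))
```

Both operations rest on the same card-by-card comparison; the operator only chooses which of them the machine is to perform on the whole deck.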

At this stage in the punched card art, which was roughly in the 
1940's, all of the ideas for computers were available in physical form. 
There were input, processing, and output of data. The punched card 
was a form of memory, in the data recorded thereon. The control pro- 
cedures were largely external, and the humans had to establish them. 
Therefore the memory of these procedures rested in the human mind 
or in manuals, with all their deficiencies. Because operators had to feed
cards and take them out of pockets and put them back in again, human
intervention was very frequent. Every situation which requires human
intervention introduces high error rates. Our technology has gone
beyond this stage and so we must explore the advantages computers offer.

It required some very ingenious people, von Neumann and others, 
to conceive the idea of putting the instructions, the procedural control,
and successive amounts of data into an internal semi-permanent but yet
erasable memory within a computer. The data establish the conditions
which exist, and these conditions then control the subsequent operations 
which are performed on the data. You can express it very simply this 
way: if X exists, then do Y; if X does not exist, do Z, a different opera- 
tion. Now, today, the external memory is kept principally in magnetic 
tape form, although you can use punched cards and paper tapes for 
this purpose if you wish. The power of computers, which we are coming 
to, is derived from their logical capabilities, MATCH, AND, OR, and
NOT, GREATER THAN and LESS THAN, and from the number of
commands which you can give to the computer in its machine language. 
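
The stored-program idea, "if X exists, then do Y; if X does not exist, do Z," can be illustrated with a toy machine in which instructions and data share one memory. Everything here is hypothetical: the three operations, the memory layout, and the names `run` and `load` are invented for the illustration and do not describe any real computer's machine language.

```python
def run(memory):
    """Execute the instructions held in memory, starting at cell 0."""
    counter = 0
    while True:
        operation, a, b = memory[counter]
        if operation == "HALT":
            return memory
        if operation == "JUMP_IF_ZERO":
            # The data decide which instruction is carried out next.
            counter = b if memory[a] == 0 else counter + 1
        elif operation == "SET":
            memory[a] = b
            counter += 1
        else:
            raise ValueError(f"unknown operation {operation!r}")

def load(x):
    """Place a small program and its data in one memory."""
    return [
        ("JUMP_IF_ZERO", 5, 3),   # cell 0: if X (cell 5) is zero, go to cell 3
        ("SET", 6, "Y"),          # cell 1: X exists, so do Y
        ("HALT", 0, 0),           # cell 2
        ("SET", 6, "Z"),          # cell 3: X does not exist, so do Z
        ("HALT", 0, 0),           # cell 4
        x,                        # cell 5: the data item X
        None,                     # cell 6: the result
    ]

if __name__ == "__main__":
    print(run(load(7))[6], run(load(0))[6])
```

No operator steps in between the test and the action; the condition found in the data selects the next operation.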

Mr. Becker already mentioned the number of parallel operations 
which can be accomplished in computers, the amount of two-way com- 
munication which exists among the components of computers, the num- 
ber and variety of peripheral devices which can be hitched on to com- 
puters. He mentioned their operating speeds, and then finally the question 
of how much software is available from each manufacturer for his par- 
ticular computer. We can't generalize about the seventy models of 
computers that are currently available in this country; I'm not going to 
try to do so. The important point about computers is that human in- 
tervention is very much reduced for the amount of work accomplished. 

Magnetic tape codes can be developed on magnetic tape so that you 
can see them. The magnetic tape is the same kind of tape in principle 
that you use for tape recorders at home; a mylar ribbon with finely 
divided iron oxide on one surface. There are a number of tape codes 
and attempts at standardization in spite of the proprietary interests in- 
volved here. There can be 200, 556, or 800 characters to the inch, so that 
the packing density of these tapes is much greater than anything on
punched cards. 

Illustration 6 shows a 1401 computer, a character by character ma- 
chine. The manual controls and the indicator lights are shown here on 
the panel, and the cabinet contains the memory, the central processor, 
and the power supply. Since the memory here is relatively small, the 
computer requires small programs and can process only small amounts 
of data at a time, because the memory holds both the program and the 
data in the same physical location. The control consoles are different 
on the models available in this country, but in general they have start 
and stop buttons, switches to establish certain conditions that are re- 
quired for operations, and lights to show conditions and errors. There 
will be buttons by which to override the errors and try the system again, 
that is, to force something through the system. 

Information must be fed into the computer just as into the punched 
card devices which were its predecessors. The 1402 card reader punch 
is an input-output device. The stacker holds about 3000 cards and they 
are fed in rapidly to introduce information into the central processor of 
the computer. It can also be used as an output device by routing the im- 
pulses out of the computer to punch cards, with which you can operate 
punch card devices or use the cards manually if you interpret them. 

Two kinds of things can be put in here, the programs to control the 
operations, and the data; you can also take out data and modified pro- 
grams. The production of modified programs is a significant capability 
of all computers. There are programming routines known as editors 
and compilers which enable you to accomplish a pretty sophisticated job 
of changing your programs. The final program is sometimes pretty far 
removed from the work of the original programmer after it has been 
edited, compiled, and assembled on the computer; it hardly will recog- 
nize its intellectual parent, if you want to put it that way. 
Illustration 7 is a picture of the magnetic cores, and Mr. Becker
described them so fully to you that I'll only say that you test the state of
the core with a diagonal sense wire and read that state out and do some- 
thing with the answer you found for that particular position. A number 
of positions taken together give you a character or word. These cores 
are rather tiny, about the size of a sequin, and many thousands of them 
are used in each computer memory. They are assembled into a core 
plane, and the planes are assembled into blocks. The result is a core 
memory in block form, with thousands of wires leading out from it. For 
easy understanding you can consider that part of the memory con- 
taining the program is static during the running of an operation, but 
the remainder is very alive; it is handling the data very rapidly. The active 
part processes data at greater speeds than does the human brain. 

The punched card devices are, in effect, so far as sorters and collators 
are concerned, extensions of human hands and arms for the manipula- 
tion of 3 X 5 cards, but they are so slow that some other equipment had 
to be developed to overcome this slowness. The magnetic tape handlers 
(Illustration 8) are one such development. A full reel of tape is put on 
one side of a unit and a take-up reel on the other, and the tape is moved 
forward and backward under the control of the computer program. The 
tapes contain programs and data. Usually tape handlers are used in 
groups, standing side by side. Not only do these units surpass humans in 
sorting things out and interfiling, they also read and write at the same 
time. As the tape moves, the magnetic reading head takes the impulses 
and puts them somewhere in the computer. After the impulses are 
processed to the computer's satisfaction, they are read onto another tape. 
The magnetic tapes are erasable and can be used thousands of times. 
As to tape speeds, the tapes may move 75 to 125 inches per second; the 
units are remarkable instruments for acceleration and deceleration; they 
start a tape from scratch and stop it very rapidly. The read-write speeds 
exceed anything the human can undertake; they range from 20,000 to 
100,000 characters per second. Although this high speed is precisely why 
magnetic tape is used as an input-output medium to a computer, modern 
computers are still largely tape-bound for most operations. The machine 
processes are retarded by the read-write tape speeds, because the internal 
processing is so much faster. 
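
The read-write speeds follow from two figures given in these remarks: the packing density (200, 556, or 800 characters to the inch) and the tape speed (75 to 125 inches per second). The short calculation below multiplies the two. It is a rough sketch that assumes any density may be paired with any speed and ignores the gaps between records and the time spent starting and stopping, so it gives peak rates only.

```python
DENSITIES = (200, 556, 800)   # characters per inch, as quoted in the text
SPEEDS = (75, 125)            # inches per second, as quoted in the text

def transfer_rate(density, speed):
    """Peak characters per second = characters per inch x inches per second."""
    return density * speed

if __name__ == "__main__":
    for density in DENSITIES:
        for speed in SPEEDS:
            rate = transfer_rate(density, speed)
            print(f"{density} char/in at {speed} in/s: {rate:,} char/s")
```

The products run from 15,000 to 100,000 characters per second, which is in line with the range quoted above; the lowest pairing falls a little under the 20,000 figure, presumably because not every density was offered at every speed.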

The data and program information flow to and from the memory, 
buffers, printers, punches, processors, and tape handlers. There is con- 
tinuous flow of impulses within the computer, but the physical motion 
is intermittent. There is a considerable flow of impulses to and from 
them for every physical motion observed. Information is transferred from 
one tape to another so that extracting or interfiling means writing on a 
fresh, reusable tape. It is not necessary to cut a tape apart, add a strip, 
and seal the parts together to interfile something. Information is taken 
off one tape and written on another one to expand (or contract) a 
tape. 
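
Expanding or contracting a tape by copying can be sketched as a single pass from the old tape to a fresh one. This is a minimal model, assuming the records are simple numbers in ascending order; `update_tape` and its parameters are hypothetical names.

```python
def update_tape(old_tape, additions, deletions=()):
    """Write a new tape from old_tape, interfiling additions and
    leaving out deletions. old_tape and additions must be in order."""
    drop = set(deletions)
    new_tape = []
    k = 0
    for record in old_tape:
        # Write any new records that belong before this one.
        while k < len(additions) and additions[k] < record:
            new_tape.append(additions[k])
            k += 1
        if record not in drop:            # a dropped record is simply not copied
            new_tape.append(record)
    new_tape.extend(additions[k:])        # new records that follow everything else
    return new_tape

if __name__ == "__main__":
    old = [10, 20, 30, 40]
    print(update_tape(old, [15, 45], deletions=[30]))
```

The old tape is never altered; it can be kept as a back-up or erased and reused.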

Some device is required for printing out the results. One such is a 
1403 printer. The impulses are obtained by reading a magnetic tape; 
the central processor arranges the characters according to specifications
and sends them to the printer. The 1403 uses a print chain; it looks some- 
thing like the timing chain in an automobile, but with type in place of 
teeth. There are 132 printing positions across the paper for each line. 
The print chain is divided into five sections of forty-eight characters 
each, and it is in constant motion during printing. The printing speed 
is about 1100 lines a minute for numerical information and about 600 
lines a minute for alphanumerical information. In operation these 
printers sound like a hail storm, because each character is struck individ- 
ually in the very brief period when a line is available and when each 
character arrives in the proper position. The 132 characters are struck at 
different instants for a line, instead of together in one blow. At these 
speeds the paper has to be fed from endless, perforated, folded piles. 

This print chain has twenty-six capital letters, ten numerals, and 
twelve symbols. At the insistence of documentalists, a few chains have 
been made with upper and lower case characters. The effect of this 
change, against a fixed rotating speed and a fixed number of positions 
on a chain, is to reduce the printing speed, because the character wanted 
doesn't come into printing position as frequently as it does in five sets 
of 48 characters each. The price of typographic excellence is to reduce 
the printing speed by about 50%. Chains can be changed in about two 
minutes, however. There are other types of computer printers; some of 
them print line by line; some use cathode ray tubes and xerography to 
print on the paper at rates as high as 5000 lines per minute. 

The most significant technical development of the Index Medicus 
or MEDLARS project was GRACE, Graphic Arts Composing Equip- 
ment, or the Photon 900 as it is called commercially. It was designed to 
provide a greater set of characters, 226 characters in the set, at 330 char- 
acters per second exposed to negative film. The GRACE type of printing 
will be used for the August issue of Index Medicus, because the machine 
is now in the National Library of Medicine for its acceptance trials. 
The entire July issue is being run through as part of the acceptance trials. 
It's turning out very nice copy, upper and lower case characters, bold
face, italics, Greek symbols, etc.

Paper tape is one form of input. Some computers can provide a paper 
tape output as well as using a paper tape input, and this output can be 
used to actuate typewriters and some other printing devices. 

The 1401 is one of the simple small computers, a character by char- 
acter machine. Most of the larger computers operate with computer 
words rather than with characters. Computer words are not quite English 
dictionary words, but they have a larger number of bits than characters. 
The word size may range from 20 to 42 bits, and the words are handled 
effectively as units in the machine. These other computer models are 
larger, in performance and physical size, than the 1401. This condition 
is not a contradiction of what Mr. Becker said about computers growing 
smaller and smaller in recent years. In general, the physical size of 
modern computers varies directly with capacity, but even the largest 
computers today are smaller than the earliest large computers. Some of 
the larger computers can handle as many as 66 tape handlers at once. 
It would take a regiment of librarians to compete with these computers. 
The memories are larger too. The common sizes of magnetic core storage 
for these word devices are 8,000, 16,000, 32,000, 64,000, and 128,000 word 
memories, for example. There are also thin film memories. 

We haven't mentioned the problem of order of records which is a 
really difficult problem in librarianship. All of these magnetic tape de- 
vices have a linear or sequential pattern on the tape, reading from one 
end of a scroll to the other, reading forward or backing up if you wish. 
Some people don't like linear scanning because of the amount of time 
consumed or the amount of processing required. They would rather 
have a computer analog of the dictionary catalog, or the inverted file 
which is broken up into a discrete order. With this order they can go 
to a physical location and find a piece of information. 

Magnetic disc storage is made, in effect, of constantly rotating phono- 
graph records with magnetic characters on the top and bottom of each 
disc. There is an access arm which can move vertically up and down the 
stack of discs and horizontally in toward the center of each of those 
discs. It has a read-write head. If you tell it that all the information on 
a certain problem is to be found in a certain location, it will shift to 
that location and read that information into the central processor. This 
action is called "random access" and for librarians' purposes that's a 
very poor choice of words. Each location is known exactly, just as surely 
as you know your house address. Your house address may not have any 
particular relationship to any sequence of numbers and words, but you 
know where it is, and the postman knows where it is. Random access is 
a direct addressing form of storing data. Random access provides a way 
of getting at what you want without going through all the tapes. It has 
other advantages as well. A similar device with much less capacity is 
the magnetic drum in which the magnetic characters are put on the sur- 
face of the drum, and the drum rotates at constant speed. 
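
The contrast between linear scanning of a tape and direct addressing on a disc can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a list stands in for the tape and a dictionary for the disc file, and the names `tape`, `disc`, `scan_tape`, and `read_disc`, together with the sample records, are hypothetical.

```python
# Illustrative sketch only: a list stands in for a magnetic tape and a
# dictionary for a disc file; all names and records are hypothetical.

# A tape holds its records in one fixed linear order.
tape = [
    ("QA76", "Introduction to data processing"),
    ("Z699", "Information storage and retrieval"),
    ("Z695", "Computerized cataloging"),
]

# A disc file lets each record be reached by its known location (address).
disc = {call_number: title for call_number, title in tape}


def scan_tape(tape, wanted):
    """Read records from the start of the tape until the wanted one is found.

    Returns the title and the number of records that had to be read.
    """
    for records_read, (call_number, title) in enumerate(tape, start=1):
        if call_number == wanted:
            return title, records_read
    return None, len(tape)


def read_disc(disc, wanted):
    """Go straight to the known address; only one record is read."""
    return disc.get(wanted), 1


title_from_tape, tape_reads = scan_tape(tape, "Z695")
title_from_disc, disc_reads = read_disc(disc, "Z695")
```

Both routes return the same record; the difference lies only in how many records must pass under the reading head before the wanted one is reached.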

There are additional peripheral devices which offer a variety of remote 
input and output stations. These can be connected to computers by direct 
wire or by radio, or tapes and cards can be sent through the mail. 

PUBLICATIONS FROM WHICH SLIDES WERE TAKEN 

1. Becker, Joseph and Hayes, Robert M. Information Storage and Retrieval: Tools, 
Elements, Theories. New York, Wiley, 1963. xi, 448p. 

2. International Business Machines Corporation. Data Processing Division (112 E. Post 
Road) White Plains, New York. General Information Manual. An Introduction to 
IBM Punched Card Data Processing. White Plains, New York, IBM, 1962? 20p. 
F20-0074. 

3. International Business Machines Corporation. Data Processing Division, White 
Plains, New York. General Information Manual. Introduction to IBM Data Processing 
Systems. White Plains, New York, IBM, 1960. 95p. F22-6517. 

4. International Business Machines Corporation. Data Processing Division, White 
Plains, New York. General Information Manual. IBM Tele-Processing 357 Data 
Collection System for Manufacturing Organizations. White Plains, IBM, c1960. 12p. 
E20-8042. 

Computerized Cataloging: 

The Computerized Catalog at Florida 

Atlantic University 

Jean Perreault 

Head of Information Retrieval Services 
Florida Atlantic University Library 
Boca Raton, Florida 

THE ORIGIN of any event or process is not a simple thing, even 
though we can view the concatenation of forces as simple when we 
are concerned with the result by itself; but such a view, of an effect with 
no consideration of the causes, would be no explanation at all. These 
causes, though they all have simultaneous effect in our resultant pro- 
cedures, must be examined separately for a clear insight into the essential 
newness of our attempt at the Library of Florida Atlantic University, to- 
gether with a realization of the continuity of our solution with that re- 
sulting from the traditional theories of the catalog. 

Therefore, before I launch into a full-scale description of our pro- 
cedures, there are three introductory aspects to be examined, plus a 
couple of digressions. 

Introduction (a) 

The first basic introductory aspect I want to examine is that of the 
image, "the bibliographic string." The image implies seriality of ele- 
ments, a seriality imposed and not necessarily already contained in the 
data to be cataloged. (This seriality might be defined as: The making 
explicit of what is implicit in the data.) The "string" must be "strung 
out" rather than left in its tangled original state: order must be imposed, 
and this order is that of the traditional theory of descriptive cataloging. 
All cataloging, as it has been developed from the late Renaissance up to 
the most current practices, has been based on some string-concept or 
other, that is, on some procrustean order imposed on the data. 

And this basic order has had as its natural concomitant the articula- 
tion that determined the seriality of the order being imposed. That is, for 
there to be order there must be parts contiguous to parts; and for there to 
be seriality there must be a pre-determined before and after. The parts 
of the bibliographic string can thus be conveniently conceptualized as 
"1, 2, 3, . . . n." 

Each of these articulations can, in terms of the original figure, be 
visualized as a "knot" in the string, as a signal identifiable because of a 
particular kind of indention, type-face, color, punctuation, or associated 
number. In other words, each such "knot" bears a particular function, 
and these functions are generically and specifically differentiated, and are 
recognized without their differentiation being in every case made per- 
fectly explicit. We can think, for instance, of non-filing initial words; our 
attempt has been to render each such function entirely explicit. 

Introduction (b) 

The second basic introductory aspect I want to recount is the histori- 
cal one. The proximate origin of our ideas and procedures has been the 
investigations carried out at the Chicago campus of the University of 
Illinois. Among the principal protagonists in that project were C. D. 
Gull, Don S. Culbertson, Louis A. Schultheiss, and Edward Heiliger. The 
Chicago campus investigation's foremost outcome was the publication 
of Advanced Data Processing in the University Library, the main point of 
which (over-simplified, of course) was that computerization of the cata- 
log (and, in the broader view, of the library in large part) was both 
possible and (possibly) economical. 

Then, during consultations between the F A U (Florida Atlantic 
University) staff and Louis Schultheiss, several modifications were ar- 
rived at in the provisional design worked out at the Chicago campus. In 
particular, it became evident that economy of space was a crucial con- 
sideration, and that the serialities traditionally generated in a biblio- 
graphic string may be more complex than necessary for their proper 
functioning. Thereafter, our whole effort was expended to make the 
knots in the string both unique and unambiguous, yet to let their position 
on the string be flexible enough (a mixed metaphor?) to conserve all the 
space that was thought possible. Some of these modes of flexibility will be 
more precisely explored later. 

Further consultation, this time with Fred Kilgour of the Yale Medical 
School Library, resulted (a) in further modifications of our provisional 
design, in order to make our system as compatible as practicable with 
that of the Columbia-Harvard-Yale Medical School Library project; and 
(b) in a large measure of agreement on the limited number of characters 
that can, economically and without intellectual loss, produce the catalog 
we have come to expect for a respectable research collection. (The print 
chain implied in all of this will also be more fully described later). 

Ruling over our historical origins, thus, has been cooperation. We 
hope that this guiding concept will become that of the computerization 
of libraries all over the country. 

Introduction (c) 

The third basic aspect I want to examine is the theoretic one. We may 
say that one of the basic questions upon which the computerized catalog 
(like every one) is based is: How can the intellectual decisions the cata- 
loger makes become embodied in a catalog? But then we must first ask: 
What is a catalog, as against an inventory? or as against a bibliography? 
Let us take a (perhaps illegitimate) shortcut here and not too closely 
examine these different kinds of lists (for they all are lists), as worthy of 
question as these others may be. Instead, I would offer as a definition 
(that is to say, a setting up of limits) that a catalog is a list, from the 
internal contents of each item within which there are generated other 
(quasi-) items. In less abstract (= more library-oriented) terms I mean: 
There is more than one access-vista to the same document in a catalog. 

The catalog thus creates a record of the document in order to "enter" 
the document in the catalog, but this document is to be entered in the 
catalog (and the user can enter into it through the catalog) by a pos- 
sible variety of entries other than the first one. Thus is logically generated 
the main entry and the various secondary entries. All of these entries 
must not only be embodied in their appropriate places in the catalog, but 
must be traceable back to their decision-source: the tracings. 

In the traditional cataloging operation, the official catalog is the 
record of the decisions of the cataloger: this record, before it is filed for 
possible later reference, is turned into main and secondary entries by 
typists, and these embodiments of the cataloger's decisions are placed in 
order and inserted into the already serialized file, by clerks. 

But, as noted before, the embodiments that arise from the official catalog 
do so implicitly, except where there are explicit tracings. And even in 
such case, proof-reading of the typed cards is no guarantee at all that the 
embodiments are equally correctly typed, or have been filed correctly. 

Correctness means conformity to rule; and the most logical means to 
guarantee strict correctness is to generate all the secondary entries at the 
same time that the main entry is generated, rather than all the others 
from the one (as with the traditional official catalog generating the public 
catalog). In the traditional catalog, however correct the results may be, 
in a sense they can be said to be "accidentally" correct. Taken in the 
precise sense, accidental means that which does not invariably follow from 
the nature of the substance; the "substance" in this case is the cataloger's 
decisions. And it is thus clear that a system which could "naturally" 
generate its results with full explicitness, a system in which every quasi- 
item was generated by "chromosomes" imbedded in its own substance 
(the knots in the bibliographic string), would be by that much preferable. 
(This could be true of an un-computerized system as well, but I 
doubt that it could be accomplished outside the computer.) This very 
built-in explicitness is what the F A U system aims to achieve (as do those 
of the Columbia-Harvard-Yale Medical Schools, Ontario New Libraries 
Project, and others). 

Digression 1 

Let us digress momentarily. Even in a computerized system, something 
akin to the "collective consciousness" (whose presupposition and oc- 
casional lapse result in either correct or incorrect filing in the traditional 
library) is attempted, for instance in setting up a table of non-filing initial 
articles. Such a word is the German "Der" which can be a truly non-filing 
article (for instance, Der Mann . . .) or a filing article in the genitive (for 
instance, Der Männer . . .); or the Dutch "De," which, though non-filing, 
can easily be confused with the Latin preposition "De" which does file. 

This sort of ambiguity, the source of filing errors in traditional filing, 
could equally become a cause of error in a system relying on tables that 
must be used to look up decisions, since in the computer there can be no 
discrimination between identicals. It seems better to us to carry explicit- 
ness all the way, to indicate every time, automatically (by "programming 
the cataloger") that here "Der" is non-filing, there it is not. The means 
by which this is accomplished will be examined later. 
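
The difficulty with a look-up table, and the remedy of recording the decision with each entry, can be sketched as follows. This is a minimal illustration, not the F A U program: the table contents and the names `NON_FILING_ARTICLES`, `filing_form_by_table`, and `filing_form_explicit` are assumptions made for the example.

```python
# Sketch only: the table and both function names are hypothetical.

NON_FILING_ARTICLES = {"der", "die", "das", "de", "the"}


def filing_form_by_table(title):
    """Drop the first word whenever it appears in the article table."""
    first, _, rest = title.partition(" ")
    if first.lower() in NON_FILING_ARTICLES and rest:
        return rest
    return title


def filing_form_explicit(title, skip_first_word):
    """Drop the first word only when the cataloger has said so."""
    if skip_first_word:
        return title.partition(" ")[2]
    return title


# The table cannot tell the nominative article from the genitive one:
# both titles lose their first word, although only the first should.
table_nominative = filing_form_by_table("Der Mann")
table_genitive = filing_form_by_table("Der Männer")

# A decision recorded with each entry removes the ambiguity.
explicit_nominative = filing_form_explicit("Der Mann", skip_first_word=True)
explicit_genitive = filing_form_explicit("Der Männer", skip_first_word=False)
```

The table treats the two occurrences of "Der" identically because, to the machine, they are identical; only the explicit indication keeps the genitive article in the filing form.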

Digression 2 

Another digression: the lecture on the IBM 357 Data-Collection 
circulation system will show that because a process in its traditional shape 
is easier to diagram, we cannot immediately conclude that the more- 
complex-to-diagram automated system is therefore really more difficult to 
handle or to operate. The reason is, once more, that the traditional 
system bears along with it all too much implicitness. 

Description of the System: Input 

And now, to begin to understand our actual cataloging process at 
F A U, let us take the sample Catalog-Input Record and fill it in. 

Shown here in Figure "a" is an imaginary title-page. The author is of 
course not to be expected to be given in full or in the correct spelling, 
but by the aid of reference sources, we determine the correct full form of 
his name, and print it into Area 10 of Figure "b."* We have encountered 
the first instance of the normally-tangled state of the bibliographic string, 
upon which we must impose our own order. 

The next instance of a tangle to overcome is the title, which (we 
shall assume) conflicts with several other editions of the same work; so 
we determine its original form from reference sources and print it into 
Area 22, surrounding it with brackets and indicating language and ver- 
sion. The publisher's title is then printed into Area 23, excluding ex- 
traneous elements, and then the imprint and collation are printed into 
Area 31, in pre-determined rather than given order. Into the imprint 
area is also printed the series note, in the form in which it has been 
established rather than that given. 

The applicable subject headings are printed into Area 70. Then, in 
Area 76, we trace the name of the person responsible for the literary shape 
of the work. There is no need here for a title-area tracing (as will be ex- 
plained later), so we print the call-number into Area 80, and then the 
location symbol and modified Luhn-number, which takes the first four 
letters of the main entry, three significant letters from the publisher's 
title, the letter that stands for the month, and the day in that month that 
the book is either ordered or cataloged. 

* Editor's note: Mr. Perreault presented many detailed figures illustrating the 
various steps in preparing information for the computerized catalog; unfortunately 
we could not publish them all, but anyone really converting to this form of cataloging 
can probably secure copies by writing him directly. 
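
The make-up of the modified Luhn-number might be sketched as below. The text gives only the four components, so several details here are assumptions and not the F A U rules: that "significant letters" means the initial letters of the first three words not found in a small stop list, that the months are lettered A through L, and that the day is written as two digits. The names `STOP_WORDS`, `letters_only`, and `modified_luhn_number` are likewise hypothetical.

```python
import datetime

# Assumed stop list; the actual rule for "significant letters" is not given.
STOP_WORDS = {"A", "AN", "THE", "OF", "AND", "IN", "ON", "TO"}

# Assumed month lettering: A = January ... L = December.
MONTH_LETTERS = "ABCDEFGHIJKL"


def letters_only(text):
    """Upper-case the text and keep only its letters."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def modified_luhn_number(main_entry, publisher_title, date):
    """Assemble the four components named in the text into one string."""
    entry_part = letters_only(main_entry)[:4]
    words = [letters_only(word) for word in publisher_title.split()]
    significant = [word for word in words if word and word not in STOP_WORDS]
    title_part = "".join(word[0] for word in significant[:3])
    month_letter = MONTH_LETTERS[date.month - 1]
    return f"{entry_part}{title_part}{month_letter}{date.day:02d}"


example_number = modified_luhn_number(
    "Smith, John", "The art of organ playing", datetime.date(1964, 6, 27)
)
```

The invented entry above yields a short tag combining author, title, and date, which is the kind of value that can tie together the punched cards belonging to one record.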

The expanded-collation code is now printed into the columns at the 
right end of the form. The main entry is personal, so we put a check in 81. 
The work, a reprint, was originally published at Leipzig, in 1891, by the 
publisher Breitkopf und Härtel, so we look up the code numbers in the 
book we have produced for that purpose and print into 81:6-19 the 
code numbers 51120, 267, and 140000. Then we print the original copyright 
date into 81:24-27, and check 81:30, 31, and 33 since the book is a 
translation, a critical edition, and a reprint. We also check 81:38 because 
our examination of the book shows a listing of editions and translations 
of the author's works. The book is a reprint by our definitions, so we go 
on to column 82 and print in the code number for the reprinting pub- 
lisher (Musica Press) and the date of the reprinting. 

Now we examine the body of the book, noticing that it has an index, 
a glossary, and a facsimile of a page of the original issue. So we check 
82:35, 36, and 28, and since the facsimile is in the form of a plate not 
included in the pagination, 82:17. It also includes a recording of the organ 
music of the author in a pocket in the back, for which we check 85:26 and 
trace (in Area 76) the performer. 

Description of the System: Output 

Now that you have become somewhat more familiar with the mode of 
input of the catalog data, let us examine some of the rules, purposes, and 
results of this data as it is keypunched, transformed into a magnetic-tape 
record, and finally printed out in book form. (The discourse will be made 
more meaningful to the reader if he refers to the copy of the Input 
Record Form, Figure b.) 

The main entry, whatever its type, is printed into Area 10. If it is a 
personal main entry, only the first two lines may be used (76 spaces), 
and the appropriate check is made in Area 81 (space 1). If the main 
entry is any of the other types, one of the other spaces in Area 81 (spaces 
2-5) is checked, and the whole Area 10 may be filled in. 

If there are elements in the main entry that would cause misfiling, 
they are blocked out of the sort-tag by symbols used for that purpose 
alone; the symbol "less-than" (<) means "What follows, up to the filing 
symbol (if one follows), is to be excluded from the sort-tag"; the symbol 
"more-than" (>) means "The sort-tag begins here (if there were a non-filing 
symbol at the beginning of the Area) or resumes (if the preceding 
non-filing symbol were later than the beginning of the Area)." 
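
The effect of the two symbols on the sort-tag can be sketched in Python. This is a minimal illustration of the rule as stated, not the program actually used: the function names `sort_tag` and `display_form` and the sample entries are hypothetical, and whether the space after an initial article falls inside or outside the symbols is an assumption.

```python
def sort_tag(area_text):
    """Build the sort-tag by leaving out whatever lies between '<' and '>'."""
    kept = []
    filing = True  # True while characters belong to the sort-tag
    for ch in area_text:
        if ch == "<":
            filing = False  # what follows is excluded from the sort-tag
        elif ch == ">":
            filing = True  # the sort-tag begins, or resumes, here
        elif filing:
            kept.append(ch)
    return "".join(kept)


def display_form(area_text):
    """Remove the two symbols but keep all the text for printing."""
    return area_text.replace("<", "").replace(">", "")


# A non-filing initial article: excluded from filing, kept in print.
tag_with_article = sort_tag("<The >Art of fugue")
printed_with_article = display_form("<The >Art of fugue")

# A non-filing symbol with no filing symbol after it: the rest is excluded.
tag_truncated = sort_tag("Faust< eine Tragödie")
```

The printed entry is unchanged by the symbols; only the tag on which the computer sorts is affected.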

The third and fourth lines of Area 10 have the same number of spaces 
as do the two lines of Area 22. This latter is for the conventional title, 
and can be used only when the main entry is personal. (The computer is 
informed of this by the check in 81:1.) Thus, the total number of spaces 
available for main entry plus conventional title (when present) is 148. 
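
The space budget just described works out as follows. The figures 76 and 148 come from the text; the 72 spaces for Area 22 are derived from them, and the function name `fits` is an assumption made for this sketch.

```python
PERSONAL_MAIN_ENTRY_SPACES = 76  # first two lines of Area 10 (from the text)
TOTAL_SPACES = 148               # main entry plus conventional title (from the text)

# Derived: the two lines of Area 22, equal to lines three and four of Area 10.
CONVENTIONAL_TITLE_SPACES = TOTAL_SPACES - PERSONAL_MAIN_ENTRY_SPACES


def fits(main_entry, conventional_title="", personal=True):
    """Check that the entry stays within the spaces allowed for its type."""
    if personal:
        return (
            len(main_entry) <= PERSONAL_MAIN_ENTRY_SPACES
            and len(conventional_title) <= CONVENTIONAL_TITLE_SPACES
        )
    # Other types may fill the whole of Area 10 but carry no Area 22.
    return len(main_entry) <= TOTAL_SPACES and not conventional_title
```

Either way the same 148 spaces are occupied on the record, which is the economy of space the design was after.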

The title of the work, when it is not superseded by a title main entry, 
is printed into Area 23, using the same non-filing symbols when necessary, 
and indicating the end of the title-to-be-traced by a non-filing symbol. 
Considerable pruning in this Area is sometimes necessary to make the 
operation of the computer economical and rapid. Only the essential 
elements are retained, and these we construe to include: (a) the actual title, 
(b) the alternative- or sub-title when necessary to explain the actual title, 
(c) the person or organization responsible for the work's bibliographic 
form, such as editor or translator if this has not been stated earlier (for 
instance in the main entry or as part of the actual title), and (d) the 
number of the edition. 

The imprint, collation, and certain essential notes are printed into 
Area 31. The most common acceptable note is that for a series, if it is to 
be traced (the process of decision whether or not to trace will be 
examined later). If the series note is to be traced in title form, punctuation 
that would make it look like an author-title series is removed and the note 
is preceded by an equal-sign; if it is one which cannot be treated as a 
title-form tracing, and accordingly must go to the main entry catalog, it is 
preceded by a per-cent sign. 

The point of all this is that no further tracing effort is necessary beyond 
printing the note into its proper place and adding the proper tracing 
symbol. (Note that again the non-filing symbols are often called for.) 

Other permissible notes are for the original title of a translation, 
change of serial title, and the like. 

Tracings are printed into Areas 70, 76, and 78. Since our catalog is to 
be divided (Author, Subject, and Title) rather than in the dictionary 
form, we do not adhere to the L C tripartition into Arabic numbers, Roman 
numerals, and parentheses; instead we retain all Arabic-number 
tracings in Area 70, but divide Roman-numeral tracings between Areas 
76 and 78, and, when series must be traced in one of these Areas rather 
than in Area 31 with the functional symbol, they too must be divided be- 
tween Areas 76 and 78. 

The call number is printed into Area 80, as is the symbol for location 
(when appropriate), and the modified Luhn-number which serves to tie 
together the deck of key-punch cards. 

Into Areas 81, 82, and 85 are printed data which in traditional cataloging 
might have been part of the collation. (The retrieval aspects of 
these bodies of information will be discussed later.) 

When the bibliographic string is available in a form closely akin to 
our own decision-pattern, the only indications necessary to make the data 
computer-assimilable are the "knots." This outside-available form of 
information is the L C printed card; all that is needed to make this 
information equally intelligible to the keypuncher is to associate the proper 
functional area-number with each section of printed data, besides excising 
those elements which would not be allowable under our policies of 
retention of essentials only, or would overflow the allowable total length 
of any Area. (See Figure c) 

When the Library of Congress has cataloged the work and the printed 
card itself cannot be supplied, a xerographic copy of the appropriate 
entry in the National Union Catalog is made through a special mask. 
Then the same process is performed as when a card can be supplied. 

Digression 3: The Print Chain 

To digress a moment from the process, the basic theory behind the 
construction of a catalog as such is that it is to provide multiple access- 
vistas to one document among a great many; it is thus (at least primarily) 
a finding tool. Yet what is being found is bibliographic data, and its or- 
ganization is along the line of principles devised for solving bibliograph- 
ical problems. There are thus poles between which we might well vacil- 
late, or which might generate tension. These poles are (on the one hand) 
machine economy in its widest sense, and (on the other) "bibliographical 
integrity" (a phrase of Fred Kilgour's). For our computerization to be a 
success it must be economical, it must not cost more per unit of work done 
than the traditional process does; and its general form must be happily 
compatible with the capabilities of computers. For our computerization 
to be a success it must also satisfy all of the bibliographic/finding-tool re- 
quirements put upon the traditional catalog; we must keep faith with the 
development of the tradition of the catalog, from the late Renaissance up 
to yesterday. 

Part of our commitment to this tradition is the shape and function of 
the large elements of the catalog; but another (and equally crucial) part 
is the shape and function of its smallest elements, the letters and symbols 
available for the embodiment of the intellectual decisions of the cataloger, 
through the agency of which the user can apprehend the content 
of the collection and the individuality of the works entered in it. 

The letters and symbols available must be various enough to embody 
all of the material which a large research collection must contain, that is, 
a great many languages other than English. They must therefore be far 
more than the 48 characters normally supplied by I B M. (Indeed, these 
48 are not even adequate to catalog an entirely English-language research 
collection.) The normal IBM print chain includes no lower-case letters; 
present are only the period, comma, slash, hyphen, equal-sign, apostro- 
phe, parentheses, and plus-sign as punctuation; no diacritical symbols 
are present; numbers (zero — nine) are of course supplied. We have as- 
sembled a new 88-character print chain from the various IBM catalogs, 
and with this design have achieved a measure of agreement with the 
Medical School Libraries of Columbia, Harvard, and Yale (a union-cata- 
log project), and with Ontario University. These institutions, planning 
for collections of different character from ours, have so far added six 
characters to the original 88; besides, there are present on the chain 
(which has 120 spaces for symbols) a variety of symbols for the use of the 
programmers and for scientific-computation use of the computer. 

The additions to the 48-character chain therefore are: lower-case 
a-z; the script L; the colon and semi-colon, the underscore, question-mark, 
square brackets, musical sharp and flat, dollar-sign, asterisk, accents 
acute, grave, and circumflex, the angstrom, umlaut, tilde, and cedilla. 

Description of the System: Retrieval Aspects (Expanded-Collation Code) 

To return to our processes: The information printed into Areas 81, 
82, and 85 is not actually part of our conception of a book-catalog, but 
rather is intended as a means of document retrieval. The basic concept 
was arrived at during our consultation with Louis Schultheiss and went 
something like this: When there are generically similar but specifically 
differentiated elements none of which can occur simultaneously, they 
may legitimately be input under a single functional number with a modi- 
fying number used to show the differentiation. In particular, the main 
entry can be either personal, corporate, uniform, anonymous-classic, or 
title; each of these could have been allocated a separately-coded area, but 
a great deal of space would have been invariably left vacant, since only 
one such area could be used for any particular work. Again, in the area 
for imprint, etc., there could be four solutions: (a) a separate field could 
be left for each element (place, publisher, date, pagination, other colla- 
tional items, series note, etc.) with at least an occasional instance of 
truncation of one or more of these internal elements; (b) an open-field 
situation as against the fixed-field arrangement just outlined, with one 
maximum for the whole area, into which each item is printed, with its 
own sub-area number; (c) an open-field situation with an accompanying 
field of yes/no checks to indicate the characteristics of the internal ele- 
ments; and (d) translation of the total verbal content of the area into nu- 
meric codes, to be re-translated into verbal output upon print-instruction. 

Whatever the advantages or disadvantages of the other possible solu- 
tions, we at F A U have chosen solution (c), and the accompanying field 
of yes/no checks is what we have expanded into our retrieval-collation 
code. In it are recorded the decisions as to type of main entry; as to pres- 
ence of a large variety of collational elements for all sorts of learning re- 
sources (not books alone), for descriptions of the work in terms of its 
literary origin and provenance; and for encoding the numerical codes 
for imprint retrieval. 
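
Solution (c), an open field accompanied by a field of yes/no checks, can be pictured with a small sketch. The check positions and their meanings below are those given in the worked example earlier in this paper; the representation as pairs of (area, space), the placeholder text for Area 31, and the names `CHECK_MEANINGS`, `describe`, and `has_all` are assumptions made for illustration.

```python
# Meanings taken from the worked example in the text; the data layout
# as (area, space) pairs is an assumption for this sketch.
CHECK_MEANINGS = {
    (81, 1): "personal main entry",
    (81, 30): "translation",
    (81, 31): "critical edition",
    (81, 33): "reprint",
    (81, 38): "lists editions and translations of the author's works",
    (82, 17): "plate not included in the pagination",
    (85, 26): "recording in pocket",
}


def describe(checks):
    """Translate a set of checked spaces into words, in area/space order."""
    return [CHECK_MEANINGS[c] for c in sorted(checks) if c in CHECK_MEANINGS]


def has_all(checks, wanted):
    """True when every wanted space has been checked."""
    return set(wanted) <= set(checks)


sample_record = {
    "area_31": "(imprint and collation, as free text)",  # the open field
    "checks": {(81, 1), (81, 30), (81, 31), (81, 33), (81, 38), (82, 17), (85, 26)},
}

is_translated_reprint = has_all(sample_record["checks"], [(81, 30), (81, 33)])
```

The open field carries the wording for the printed catalog, while the checks carry the same facts in a form the machine can test directly.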

With this device we can retrieve the relatively few documents which 
have a large variety of desired characteristics, many of which would not 
be available through even the most thoroughgoing traditional cataloging. 
For instance, we could extract a list (of either call-numbers or the whole 
catalog entry) of those works bearing not just one subject-heading but a 
large number, for instance "A" or "B" plus "C" or "D" or "E" plus "F", 
besides specifying that we want only those published in a particular city 
or cities, country or countries, between certain dates, including an index 
and a bibliography that goes beyond (foot-) notes, and certain types of 
illustrative materials. Except for the subject-heading search, even one 
such criterion is all too much for a traditional library; we propose to 
perform such searches on a more or less routine basis (though to do so 
economically we must use batch-processing or random-access devices). 
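
A search of this kind can be sketched as a simple filter over the catalog records. The sketch below is an illustration only, with invented records: the class `Entry`, the function `matches`, and the feature names are assumptions, and a real run would read the magnetic-tape records and use the numeric codes for city and publisher.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    call_number: str
    subjects: set
    city: str
    year: int
    features: set = field(default_factory=set)


def matches(entry, subject_groups, cities, first_year, last_year, features):
    """Every subject group must be met by at least one heading ("or" within
    a group, "plus" between groups); all other criteria must also hold."""
    return (
        all(entry.subjects & group for group in subject_groups)
        and entry.city in cities
        and first_year <= entry.year <= last_year
        and features <= entry.features
    )


# Invented records for illustration.
catalog = [
    Entry("ML 100", {"A", "C", "F"}, "Leipzig", 1891, {"index", "bibliography"}),
    Entry("ML 200", {"B", "E", "F"}, "Leipzig", 1925, {"index"}),
    Entry("ML 300", {"A", "D"}, "Leipzig", 1900, {"index", "bibliography"}),
    Entry("ML 400", {"B", "D", "F"}, "Boston", 1900, {"index", "bibliography"}),
]

# "A" or "B" plus "C" or "D" or "E" plus "F", published in Leipzig
# between 1880 and 1930, with an index and a bibliography.
query = dict(
    subject_groups=[{"A", "B"}, {"C", "D", "E"}, {"F"}],
    cities={"Leipzig"},
    first_year=1880,
    last_year=1930,
    features={"index", "bibliography"},
)

hits = [entry.call_number for entry in catalog if matches(entry, **query)]
```

Each record is examined once in the order in which it stands on the tape, which is why the search suits batch-processing: many such questions can be put to the file in a single pass.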

Work Flow 

Let us now follow the outlines of the work flow. The card accompany- 
ing the book, or the xerographic copy of the N U C entry, or the original- 
cataloging transmittal sheet, is compared to the author and title authority 
files (to be explained later) to see if any new decisions or alterations are 
necessary. The cataloger then connects functional numbers to the ap- 
propriate data (we assume the presence of the L C card) and fills in the 
collation code. The books are sent to the shelf while clerks complete the 
operation by filling in the numerical codes for city, country, and pub- 
lisher, and prepare subject-authority references whenever the need for 
new ones arises. 

The input record thus prepared is keypunched; a deck of punched 
cards thus arises, from which a magnetic-tape record is created. This 
magnetic-tape record is not, as compared with those input just before and 
after it, in any kind of order; the order of input to the master tape is 
entirely random. From this random-order tape we will create the three tapes 
from which the three parts of the catalog will be printed. When the 
master tape has been created from the punched cards, it is read through 
the computer onto the three "write" tapes, each to contain the data 
selected from the master tape for the particular catalog function. These 
functions are made known to the computer by the functional area-numbers 
and other symbols, such as the series-tracing equal- or per-cent signs. 

For example, an entry with a personal main entry has (as do all entries) 
an Area 10, and when the master tape is read, this functional number 
means to the computer: "Take Area 10, followed (if present) by 
Area 22, and then by Areas 23, 31, and 80, to the author tape." If 
there is an Area 22 (which could occur only with a personal main entry, 
since the computer would not even bother to read Area 22 if the main 
entry were of any other type), the computer gets the message: "Take 
Areas 22, 10, 23, 31, and 80 to the title tape." If Area 22 is present, it is 
normal for there to be no need for tracing the publisher's title (Area 23) 
since a conventional title by definition calls for see-reference from each 
publisher's-title variant in the collection. But if this Area 22 were one 
which could not become part of a conventional-title see-reference structure 
(for instance [Works, dramatic. Selections. English. Smith]) it 
might be necessary to trace the publisher's title. This is done by printing 
simply "23" into Area 78. With no Area 22 present, the data printed into 
Area 23 generates the write instruction "Take Areas 23 (up to the 
non-initial non-filing symbol), 10, 23, 31, and 80 to the title tape." 
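
The write instructions for a personal main entry can be sketched as a small routine. This is an illustration under stated assumptions, not the F A U program: a record is represented as a dictionary from area number to text, Area 78 is represented as a list of segments, the sample data are invented, and the order of areas following a publisher's-title tracing when Area 22 is also present is assumed to match that of the other secondary entries.

```python
AUTHOR, TITLE = "author tape", "title tape"


def areas(record, numbers):
    """Collect the text of the listed areas, skipping any that are absent."""
    return [record[n] for n in numbers if record.get(n)]


def title_to_trace(area_23):
    """Area 23 up to the first non-initial '<', which ends the title-to-be-traced."""
    cut = area_23.find("<", 1)
    return area_23 if cut == -1 else area_23[:cut]


def write_instructions(record):
    """Return (tape, segments) pairs for a record with a personal main entry."""
    instructions = [(AUTHOR, areas(record, (10, 22, 23, 31, 80)))]
    if record.get(22):
        # Conventional title present: it leads the title-tape entry.
        instructions.append((TITLE, areas(record, (22, 10, 23, 31, 80))))
        trace_publisher_title = "23" in record.get(78, [])
    else:
        trace_publisher_title = True
    if trace_publisher_title:
        segments = [title_to_trace(record[23])] + areas(record, (10, 22, 23, 31, 80))
        instructions.append((TITLE, segments))
    return instructions


# Invented sample data.
plain = {
    10: "Smith, John",
    23: "<The >art of organ playing< edited by A. Jones",
    31: "imprint and collation",
    80: "call number",
}
with_conventional_title = {**plain, 22: "[Works, organ]"}
with_both_traced = {**with_conventional_title, 78: ["23"]}
```

One master record thus yields two or three write instructions, each headed by the element under which that copy of the entry will be filed.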

Area 31 is then read and written, but does not normally generate 
tracings; but if there is present a series note which has been established 
as traceable, the appropriate symbol placed before it will cause tracing.
If the series note to be traced is in title form (which we prefer, as we do
for serial main entries), it is preceded by the equal-sign, which gives the
write instruction, "Take the series note, followed by Areas 10, 22, 23, 31
(without the note being traced), and 80 to the title tape." If the series
note to be traced is unavoidably in author-title form, it is preceded by the
per-cent sign, which gives the write instruction, "Take the series note
(broken into its author and title components), followed by Areas 10, 22,
23, 31 (without the note being traced), and 80 to the author tape." If the
series note is in author-title form with the series-author indicated by his,
her, its, or their, it is again preceded by the per-cent sign, which gives
the write instruction, "Take Area 10, followed by the series note less the
initial underlined word, and Areas 22, 23, 31 (without the series note
being traced), and 80 to the author tape."

Area 70 is then read through; the separate subject-headings traced 
there are each ended with a record-mark (except the last), and each 
generates a separate write instruction, "Take this segment of Area 70, 
followed by Areas 10, 22, 23, 31, and 80, to the subject tape." If the work 
is autobiographical, the number "10" in Area 70 as a separate segment 
generates a write instruction to use the data in Area 10 twice, first as sub- 
ject and then as author. 
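
The write instructions quoted above for a personal main entry can be
sketched in a few lines of Python. This is an illustration under stated
assumptions, not the program actually used: the record is modeled as a
dictionary keyed by area number, the record-mark is represented by a
vertical bar, all function and variable names are invented, and the series
tracings, Areas 76 and 78, and the non-filing symbol are left out.

```python
# Illustrative sketch only; the names and the "|" record-mark are assumptions.
RECORD_MARK = "|"
BODY = (10, 22, 23, 31, 80)  # areas that follow a traced heading


def areas(record, numbers):
    """Return the data of the listed areas, skipping any that are absent."""
    return [record[n] for n in numbers if n in record]


def distribute(record):
    """Route one master-tape record (personal main entry) to three tapes."""
    tapes = {"author": [], "title": [], "subject": []}

    # Author tape: Area 10, then Area 22 if present, then Areas 23, 31, 80.
    tapes["author"].append(areas(record, BODY))

    # Title tape: lead with Area 22 when present, otherwise with Area 23.
    lead = 22 if 22 in record else 23
    tapes["title"].append(areas(record, (lead, 10, 23, 31, 80)))

    # Subject tape: one entry per segment of Area 70.
    for segment in record.get(70, "").split(RECORD_MARK):
        segment = segment.strip()
        if not segment:
            continue
        # The shorthand "10" means: use the Area 10 data as the subject.
        heading = record[10] if segment == "10" else segment
        tapes["subject"].append([heading] + areas(record, BODY))
    return tapes
```

With an invented autobiographical record whose Area 70 reads "10|Authors",
the sketch writes one author entry, one title entry led by Area 23, and two
subject entries, the first of which is headed by the Area 10 data.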

The manipulations performed on the data in Areas 76 and 78 are
similar to those done in Area 70, except that the destination of the
secondary entry is in each case a different tape. Shorthand like the use of "10"
in Area 70 is used here whenever possible to make operation and input 
equally economical. 

Filing Rules 

The three tapes from which to print the catalog have been thus gen- 
erated as automatically as practicable from the one block of "official" in- 
put data. Each tape is now in raw order, and must be put into alphabeti- 
cal order before printing. This brings up the problem of filing rules. 

As mentioned before, we consider a divided rather than a dictionary 
catalog appropriate because of the predictable difficulty involved in pro- 
gramming a body of rules as complex as those for library filing. Secondly, 
we have committed ourselves wholeheartedly to the idea that filing is to
be interpreted as a function of the symbols whereby the data is
communicated, rather than by general principles of organization which bear no
relation to the symbolization which is the basis for any collation sequence.

Thus, in the traditional library catalog, not only are all of the Saints 
named John in a group before the Popes named John, followed by all the 
kings similarly named (all of which contravenes the idea of strictly alpha- 
betical filing), but among the kings the organization is not by the symbols 
following the name, but by the country governed. This is possible only by 
the elevation of general principles of hierarchies of ranks above the
principle of collational order determined by symbols.

Our order of collation, on the contrary, is set up thus: double-blank, 
single-blank, A through Z, zero through nine. Double-blanks are generated
by a period, comma, parenthesis, or bracket standing next to a single-blank.
Thus "London, Jack" would come before "London during the
great fire," which in turn would come before "London's historic houses." 
A collation sequence which ignored all punctuation would instead file 
these three entries "London during the great fire," "London, Jack," and 
"London's historic houses." Our order of titular names will be strictly in 
accordance with the symbols used, so that "John, duke of Gaunt" will 
come before "John, King of England," and "Charles II, King of England"
would come (probably) directly before "Charles II, King of France," 
with both Charles I's preceding them, rather than the traditional anti- 
symbolic order. 
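
The collation sequence just described can be sketched as a sort key in
Python. This is a minimal sketch, not the filing program itself. It assumes
that a run of blanks and marks containing at least one blank and one of the
marks . , ( ) [ ] counts as one double-blank, that a mark with no blank
beside it is dropped, and that any other symbol (an apostrophe, for
instance) is ignored. The function names are invented for the illustration.

```python
import re

# Ranks follow the stated sequence: double-blank, single-blank, A-Z, 0-9.
DOUBLE_BLANK, SINGLE_BLANK = 0, 1
RANK = {ch: i + 2 for i, ch in enumerate("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")}
MARKS = ".,()[]"  # marks that turn an adjacent blank into a double-blank


def filing_key(entry):
    """Build a sort key from the symbols of an entry."""
    def collapse(match):
        run = match.group(0)
        has_blank = " " in run
        has_mark = any(ch in MARKS for ch in run)
        if has_blank and has_mark:
            return "\x00"  # placeholder standing for a double-blank
        return " " if has_blank else ""  # a mark with no blank beside it is dropped

    text = re.sub(r"[ .,()\[\]]+", collapse, entry.strip().upper())
    key = []
    for ch in text:
        if ch == "\x00":
            key.append(DOUBLE_BLANK)
        elif ch == " ":
            key.append(SINGLE_BLANK)
        elif ch in RANK:
            key.append(RANK[ch])
        # any other symbol is ignored (an assumption of this sketch)
    return tuple(key)


def punctuation_blind_key(entry):
    """Comparison key for a sequence that ignores all punctuation."""
    kept = re.sub(r"[^A-Z0-9 ]", "", entry.upper())
    return " ".join(kept.split())
```

Sorting the three London entries with `filing_key` puts "London, Jack"
first, while `punctuation_blind_key` puts "London during the great fire"
first, which is the contrast drawn in the text.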

Work Flow: Summary 1 

As a first summary of the flow of the material, observe the outline 
(figure d) showing that the data is input by a cataloger onto Input- 
Record Forms, keypunched into decks of cards, read onto a random-order 
master tape, distributed onto the three catalog-production tapes by the 
functional numbers, sorted one at a time, and printed on the IBM 1403 
printer with its new 88-character chain. It is then reduced by a camera to 
a size easier to use than the huge full-size IBM sheets, and again printed 
(in multiple copies) on an offset press. 

Control Documents 

But into this complicated but straightforward process must be intro- 
duced the controls necessary to lead the user from subject to subject, 
from variant author to established author, and from variant titles to 
conventional title. These controls must also keep the cataloger from 
accidentally using any of the variant forms that are supposed to be re- 
placed by established forms. 

There have been established, therefore, four types of control docu- 
ments. 

Type 1 is used to refer from variant to established subject (x-refer- 
ences and their inversions, see-references), and from one subject to 
related subjects (sa-references). The information input here generates 
references only in the actual book-catalog, since we have in the LC Subject
Guide a record of our decisions.

Type 2 is used to refer from variant authors to established ones, from 
variant title-form series to established author-title-form series, from title- 
position anonymous classics to author-position ones, and of course from 
author to author as sa-references. These references are printed out in the 
book-catalog but also generate a booklet for the cataloger listing all 
variants under their established forms, plus each of the variants referring 
to the established forms. 

Type 3 is used to refer from variant titles to established ones, from 
variant author-title-form series to established title-form series, and from 
title to title as sa-references. These references are like Type 2 in that they 
print out both in the book-catalog as see-references and form an authority
file for the cataloger containing sa-, x-, and see-references. 

Type 4 is used for reference from variant titles of a single work to a 
conventionalized form. It also carries standard subject-headings for that 
work, which generate blanket references in the subject catalog to the 
appropriate places in the author and title catalogs where all the editions 
will be found. 

Each of these types of control documents (except Type 1) thus serves
a double function: it creates a computer-assimilable record of a
decision for the use of the cataloger, and a guide for the user to the right
place to find what he is after, all automatically from the very document
on which the cataloger records his decision.

Work Flow: Summary 2 (with Controls) 

It can be seen, then, that before the computer-print step in our earlier 
flow of materials can be made, we must merge in these separately-pro- 
duced control tapes. The author-catalog printing then results from the 
sort of parallel flow shown in figure e. There is, of course, a similar 
operation in parallel for the production of each of the other tapes 
(figure f). 
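
The merge step can be pictured as the merging of two sequences that are
each already in filing order. The sketch below uses Python's standard
`heapq.merge`. The entries are invented examples, and a plain case-folded
comparison stands in for the filing rules, so it shows only the shape of
the operation.

```python
import heapq


def merge_tapes(catalog_entries, control_entries, key=str.casefold):
    """Merge two already-sorted sequences into one print sequence."""
    return list(heapq.merge(catalog_entries, control_entries, key=key))


# Invented entries: a sorted author tape and a sorted control tape.
author_tape = ["Clemens, Samuel Langhorne. Works", "London, Jack. Works"]
control_tape = [
    "Hawthorne, N. see Hawthorne, Nathaniel",
    "Twain, Mark see Clemens, Samuel Langhorne",
]
print_sequence = merge_tapes(author_tape, control_tape)
```

Because both inputs are sorted beforehand, the merge needs only one pass
over each, which suits sequential media such as tape.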

Conclusion 

Moving from detail to a broader view, we find, first, a hope that by
these means we can radically release a number of typists and filing-clerks
from humdrum and consequently error-laden tasks, laying these drudgeries
instead upon a machine that can produce entirely faithful embodiments
of our decisions and file them by inflexible programs rather than
vague remembrance of principles, without ever tiring or relaxing. Second,
we retain all that the traditional catalog has to offer except unlimited
space. We must conserve space for the computer to operate really
efficiently; but we have actually added information we could not hope to
find in the traditional catalog and have added this information in a way
that makes possible a use of the computer to search the tape-stored catalog
in a way that can be a great and fruitful aid to reference and research
(rather than saving information just for its own sake, never to be
accessible in any truly helpful way).

Traditionally, what we want in a library catalog is accurate informa- 
tion handily arranged and pleasantly intelligible, capable of flexibility 
to harmonize with each sort of document, and able to be kept almost 
instantaneously current. The card-catalog does these things, sometimes, 
or at least some of them. The book-catalog does them, too, and opens the 
way to new and challengingly promising functions. And (we sincerely 
believe) it can even beat the card-catalog on its own field, in almost all 
of the areas of comparison: accuracy, handiness, intelligibility, flexibility, 
currency. 
Automatic Classification and Indexing,
for Libraries?* 



Donald V. Black 
Consultant on Automation and 
Information Science, The Library 
University of California, Los Angeles 



THE TITLE of this paper ends with a question mark to indicate that 
I am not suggesting that automatic classification and indexing is a 
fait accompli. Probably I will raise more questions than I answer. 

The first thing that any speaker ought to do is to be sure that his 
audience understands the title of his talk. There is a certain amount of 
imprecision in my title today, and I am going to exploit this by making 
up a definition to suit my purposes. I plan to use the two terms "classifica- 
tion" and "indexing" somewhat interchangeably, contrary to normal 
library practice. By "classification" I mean the placing of written
works, of whatever nature, into subject classes, whether or not these sub- 
ject classes have any systematic order, i.e., generic-specific relationships. 
By "indexing" I mean the assignment of subject terms to any kind of writ- 
ten material. Notice the difference. "Classification" is concerned with sub- 
ject headings, and "indexing" is concerned with subject terms. What I 
intend to do, then, is to use the word "classification" in dealing with 
typical library-type materials. These would be conventional books and 
other printed materials, which would normally go into a library of the 
traditional type. "Indexing" I intend to use in connection with the type of 
printed materials we now tend to call "technical report literature," or 
other such written communications, which are not so widely held in 
conventional libraries and for which subject headings in the usual, tra- 
ditional library practice are very rarely used. Now, in my opinion, the 
methods which I intend to discuss are the same for both automatic classi- 
fication and automatic indexing. Thus, it is of relatively small importance 
whether at one moment I talk about classification or about indexing,
since the processes which I am going to discuss are equally applicable to
the one or to the other.

At this point, it is perhaps useful that I should explain my personal 
philosophy of librarianship. There seem to be, currently, two camps 

* Opinions stated herein are those of the author and do not necessarily represent 
those of the UCLA Library Administration. Mr. Black is now Head, Technical Services, 
University of California, Santa Cruz. 
among librarians. One is perhaps best characterized as being completely
bedazzled by the glittering chrome on the computing machinery: the 
flashing lights, the whirling tapes, the stream of punched cards flowing 
in and out, etc. The other camp is frequently characterized as hiding 
behind a wall built of moldy, old tomes, where their councils meet to 
decry the advent of mechanization and to pass from hand to hand a 
favorite volume, so that all may feel the pages, admire the binding, smell 
the ink on the page. From time to time, one of their number will mount 
a stack of books to look over the battlements, to view the advancing 
column of computers, and to shout imprecations against those who seem 
to be guiding the oncoming machines. Librarians seem to regard the 
computer as either an instrument for their oppression or as one for their 
liberation. Now, I have never cared for any system of logic which allows 
for only two states. Thus, I hope that I am representative of some third 
school, or middle ground, or what you will, which can see the computer — 
or any device for the mechanization of processes — as a liberating instru- 
ment on one hand, as well as an instrument for oppression on the other, 
and can, hopefully, maximize the liberation and minimize any oppres- 
sion. This is a tall order for anyone, and I am not sure that I am man 
enough to do anything to further this somewhat pious hope. The situa- 
tion, though, is often not as bad as it seems. What may seem to be op- 
pression may be, in reality, liberation, if we are but honest with ourselves. 

Now, to the actual discussion at hand. Probably the first hint of a 
system of automatic classification or indexing was contained in an article 
by H. P. Luhn in the IBM Journal of Research and Development, 
October 1957. 10 This was entitled "A Statistical Approach to Mechanized
Encoding and Searching of Literary Information." Luhn followed this 
germinal article by many more over the years in one place or another; 
however, his basic technique of using statistical methods did not change. 9 
He is responsible, however, for the modern day technique of permutation 
indexing, although it was a librarian, Crestadoro, who first suggested this 
back in the 19th century. 3 Evidently the basic idea of Key-Word-In-Con- 
text, as the permutation indexing systems have become known, occurred 
to a number of individuals at about the same time. At least one of them, 
Herbert Ohlman, then of the System Development Corporation, was 
aware of Crestadoro's initial idea behind the permutation indexing tech- 
nique. Evidently Luhn, however, was not aware of any predecessor in this 
area, and to him and the IBM Corporation must go full credit for 
developing the KWIC system into what it has become today. 

Phyllis Baxendale, also of IBM, suggested 2 some techniques which 
could produce an automatic index of journal articles, provided that the 
complete text were available for machine input. These differed from 
Luhn's techniques in being based on the prepositional phrase rather than
a statistical count of the frequency of words used. 

Both of these systems, developed by IBM personnel, can be catego- 
rized as, basically, operating on the materials at hand, without a large 
body of computer-stored information to aid them. The Key-Word-In-Context
System and also Baxendale's "phrase system" both operate using
a relatively small store of words in computer memory. In the KWIC 
system, the only words which must be stored are those which are deemed 
non-significant and which will not be used as key words. In the Baxen- 
dale system, the only words which need to be stored are the prepositions 
and all words which are to be excluded from further consideration. 
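
The permutation idea behind Key-Word-In-Context can be sketched briefly.
The following is a minimal illustration, not IBM's program: the stop list
is a small invented sample of "non-significant" words, the "/" that marks
the original end of the title is an assumed convention, and the function
name is hypothetical.

```python
# Invented sample of non-significant words; a real list would be longer.
STOP_WORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to"}


def kwic(titles, stop_words=STOP_WORDS):
    """Return (keyword, rotated title) pairs sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:?!\"'()").lower()
            if not key or key in stop_words:
                continue  # only the stored non-significant words are skipped
            # Rotate the title so the keyword leads; "/" marks the old end.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((key, rotated))
    return sorted(entries)
```

Applied to the single title "Searching Natural Language Text by Computer",
the sketch yields five entries, one for each word other than "by", in
alphabetical order of keyword.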

Obviously, what I am saying here is an over-simplification, and I trust
that the experts will not take me to task for neglecting to mention some 
of the finer points. However, it does not seem necessary, in this sort of 
introductory discussion, to go into these details. Furthermore, I am per- 
sonally of the opinion that the Key-Word-In-Context approach has been 
developed about as far as it can go, and that while it can, indeed, be use- 
ful in many library functions (which I will not discuss further here) we 
can look to it for no more new or startling developments, and its future 
exploitation will bring us no nearer our far-distant goal of perfect control 
over printed materials. 

The next point in our history which I want to discuss is the article by 
Don R. Swanson in the October 21, 1960 issue of Science 20 which was
entitled "Searching Natural Language Text by Computer." This article 
reported on work which began considerably before the date of publica- 
tion and was carried on for some time thereafter. Before proceeding, I 
want to make it clear that I consider myself a disciple of Swanson. I 
worked on this early project with Dr. Swanson and others, and very 
recently I have tried to implement a system which is based upon his ideas. 

For those of you who are not familiar with this early experiment, 
perhaps it would be worthwhile briefly to recapitulate it, since it does 
form the basis for what I believe to be a workable system of automatic 
classification and/or indexing. This early experiment by Swanson pitted 
human indexing and retrieval against machine retrieval, using material in 
nuclear physics as an experimental library. Note that there was, initially, 
no special attempt to produce indexing or classification by means of the 
computer in this first experiment. The computer was to have available to 
it the full text of the experimental library. Thus, it was felt that no in- 
dexing, as such, was necessary. 

A group of individuals, all of whom held Ph.D. degrees in nuclear 
physics, was assigned a number of issues of the journal Physical Review 
which was to form the experimental library. There were 96 articles in the 
field of low-energy nuclear physics which were used as the primary ex- 
perimental body of materials. Each physicist produced a series of ques- 
tions which were made up from the articles which he was assigned to 
read. These were examination type questions. I will quote one or two of 
them here for illustration. For example, "What is the best available value 
for the electric quadrupole moment of the deuteron?" Another: "What 
nuclear reactions are sensitive to the spin and parity of mesons, and hence 
are useful in measuring those quantities?" One last example: "Are 
bevatron neutron beams monoenergetic?" The articles which elicited
these questions were termed "source articles." 

The entire group of physicists then examined the experimental library 
with all the questions in hand, and decided which articles had pertinence 
to each question. This pertinence was rated on a scale of one to ten. 
(There is a technical difference between pertinence and relevance, which 
it does not seem necessary to observe here. Suffice it to say that sometimes 
an item may be relevant but not pertinent. Ingenious readers can con- 
struct their own examples.) In the first experiment, a completely separate 
group of nuclear physics articles, taken from Physical Review and some 
other physics journals, was examined, and on the basis of this a system of 
subject headings was created. This was not a classed system but followed 
(more or less) Cutter's rules for alphabetic subject headings. 4 

It must be emphasized that the questions which were to be asked of 
the system as a test of retrieval effectiveness were not used in preparing 
the initial subject heading list. In this respect, the experiment certainly 
follows normal library practice. That is, the catalogers have no idea who 
is going to use the products of their efforts, how these patrons will ap- 
proach the catalog, or in what way they will express their needs and 
wants. There is some question, of course, whether or not these exam-type 
questions would ever be asked of an information system in the real world.
Personally, I do not think so, although there may be a small percentage 
of the total list of questions which could occur in a real-life situation. 

The reasoning behind this experiment was that if the computer had 
the complete text available to it, no classification or indexing would be 
necessary for the documents; and that the computer, having the full text 
available to it for search, could retrieve material more successfully than 
a human who would have only a catalog to search. As you know, this 
hypothesis was tested, and under the conditions of the experiment it was 
found to be true. The human retrievers found only a fraction of the 
material which the computer was able to retrieve. Interestingly enough, 
it was also discovered that neither system, either separately or jointly, 
retrieved every document known to be pertinent to a given test question. 
The group of people who actually performed the retrieval experiments 
was non-homogeneous. There were mathematicians, physicists, computer 
personnel, and librarians. None of these people had seen the test ques- 
tions prior to performing the retrieval experiments; they were not in- 
formed of the results until after their work had been done. 

At the end of the first experiments, the results looked something like 
this: For the conventional subject heading search, the maximum
percentage of source documents (those documents which elicited the
questions in the first place) retrieved was 36%.
The best computer search retrieved 84% of the source documents. For 
each question, the average number of relevant documents known to be 
in the experimental library was 6.9, and the average number retrieved in 
the conventional search was 1.2, while the number of irrelevant documents
retrieved averaged 2.6. (See Table I.) Thus, for approximately
every one relevant document that was retrieved, two irrelevant docu- 
ments were retrieved in this conventional subject heading search. (N. B. 
All figures exclude the source documents, except where they are specifi- 
cally mentioned.) Results of this first experiment were reported in the 
above mentioned article but no widely-available report, at least in the 
library literature, was made on the second phase of the studies. 21 

One of the criticisms of the first study was that the conventional sub- 
ject headings had been made up without any idea of even the type of 
question that might be asked. Therefore, in the second phase the entire 
list of questions was reviewed, and new subject headings were created. 
Then the entire experimental library was re-indexed using the new list 

TABLE I

RETRIEVAL RESULTS ON NUCLEAR PHYSICS LITERATURE
A COMPARISON OF HUMAN SEARCH VS. COMPUTER SEARCH VS. TITLE SEARCH

R = Number of relevant documents retrieved
I = Number of irrelevant documents retrieved
SD = Source Documents retrieved
C = Conventional search, Phase I
C' = Conventional search, Phase II
A = Computer search aided by Thesaurus
T = Title search
NLS = Computer search with cutoff to retrieve 100% of Source Documents
Note: all results are averages per question
of subject headings. There was a considerable time span involved in these
activities, so that probably there was very little, if any, carry-over from 
one experiment to another. Also, the indexer was not a physicist, and 
thus little carry-over normally would be found. 

On the basis of the second experiment, it was discovered that the 
review of the materials had improved the retrieval effectiveness in that 
an average of 1.3 relevant documents (as opposed to 1.5 irrelevant
documents) was found for each question; and the number of source documents
retrieved improved also to 54% as opposed to 36% in the first
experiment. The magnitude of the improvement is really most noted in 
the fact that the number of irrelevant documents per relevant document
was reduced by 50%, approximately, although the actual number of 
relevant documents retrieved was not increased in any significant degree. 
By allowing more irrelevant retrieval, the retrieval effectiveness of the 
conventional search was improved up to a maximum percentage of 56% 
of the relevant information known to be available in the library, and 
88% of the source documents. At the same time, however, the percentage 
of irrelevant retrieval went up from 54% to 77%. For the computer to
retrieve 100% of the source documents, it was necessary that 62% irrele- 
vant material be retrieved, and an overall total of only 53% of relevant 
information known to be in the library was obtained. As you see, even 
the computer wasn't too successful! 
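
The arithmetic behind the conventional-search figures can be checked with a
short calculation. The sketch below uses only the averages quoted above
(1.2 relevant and 2.6 irrelevant documents in Phase I, 1.3 and 1.5 in
Phase II, 6.9 relevant documents known per question). It assumes that the
6.9 average applies to both phases and that the percentage of irrelevant
retrieval means the irrelevant share of all documents retrieved. The word
"recall" is a later term for the fraction of known relevant documents
retrieved and is not the speaker's; the function name is invented.

```python
def summarize(relevant, irrelevant, known_relevant):
    """Derive per-question ratios from average retrieval counts."""
    retrieved = relevant + irrelevant
    return {
        "recall": relevant / known_relevant,          # share of known relevant found
        "irrelevant_share": irrelevant / retrieved,   # share of output that is irrelevant
        "irrelevant_per_relevant": irrelevant / relevant,
    }


KNOWN_RELEVANT = 6.9  # average per question; assumed the same in both phases
phase1 = summarize(1.2, 2.6, KNOWN_RELEVANT)
phase2 = summarize(1.3, 1.5, KNOWN_RELEVANT)

# Relative drop in irrelevant documents per relevant document.
reduction = 1 - phase2["irrelevant_per_relevant"] / phase1["irrelevant_per_relevant"]
```

Under these assumptions the Phase II irrelevant share comes to about 54%,
which agrees with the starting figure quoted above, and the number of
irrelevant documents per relevant document falls by a little under half.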

At a subsequent period, the titles of the articles in the test "library" 
were reviewed, and retrieval effectiveness of the titles alone was tested 
against the questions. It was found that the titles alone retrieved 93% of 
the source documents. However, since these source documents had sug- 
gested the questions to begin with, the language of the title may well have 
had some bearing on the language used to form the question. Thus, 
source document retrieval was not considered as indicative of the effec- 
tiveness of titles in this respect. However, the titles alone did produce 
33% relevant retrieval* (non-source documents) at a penalty of only 
50% irrelevant retrieval.** We will have more to say about the titles 
presently. 

The members of Group Two, mentioned above, performed retrieval 
experiments on the Experimental Library. All the members of this group, 
except the librarian, constructed search questions which were presented 
to the computer. The first experiment depended solely on the knowledge
of physics of the men involved and on their knowledge of the manipulation
of retrieval questions within the computer.

Using the full text of the Experimental Library, a "Thesaurus" was 
constructed by the computer which sought to link together synonyms, 
near synonyms, word associations, etc., which might be of aid to the re- 
trievers in their experiment. This was produced in printout form on the 
computer and was available to each individual for the second retrieval 

* That is, 33% of the known relevant documents were retrieved. 
** That is, 50% of the documents retrieved were judged irrelevant. 
experiment. These individuals were not aware of their successes or
failures in the first experiment.

Rather than give any more details along these lines, I will only add
that [...] to insure the experiment's being as objective as possible.
Everything which could conceivably influence the results was firmly
controlled.

After the computer search which was aided by the Thesaurus,
and as part of the second phase of the overall experiment, a detailed
examination was made of the results of the two experiments. That is, all of
the retrieval experiments, both by humans directly and by use of the
computer, were examined to discover, if possible, the reasons that retrieval
effectiveness was, in general, so poor. Despite the fact that all of the
individuals who were involved in the computer retrieval experiment were
competent and held advanced degrees (some in nuclear physics), it was
discovered that the search instructions which were formulated for the
computer exhibited elementary oversights and very few exhibited any great
[...] or insight. It was decided that search instructions
could probably be formulated as well by the computer itself; at least the
computer would exhibit greater consistency than had the human beings.
We will return to this point later.

[...] span of three or four sentences of one another, they were much
more likely to be related conceptually (within the context [...])
than if their proximity were greater than this three or four sentence span.
The proponents of coordinate indexing techniques have used mere
co-occurrence [...] of their systems for retrieval; and co-occurrence is
[...] coupling. Such devices as "role indicators"
[...] strengthen the coupling between terms
[...] for retrieval on a coordinate basis, but coordinate
indexing still loses a great deal of information because it lacks the
[...] syntactical specification. There have been recent experiments
which tend to indicate that "roles" and "links" do not really
provide any useful measure of syntactical specification. 1 However,
proximity does seem, in some cases at any rate, to provide a practical
substitute for syntactical specification.
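
The proximity test described above, in which two terms count as coupled
only when they fall within a few sentences of one another, might be
sketched as follows. This is a rough illustration, not the procedure used
in the experiment: the sentence splitter is a crude stand-in, terms are
matched as plain substrings, and the function name and the default span of
three sentences are assumptions.

```python
import re


def within_span(text, term_a, term_b, max_sentences=3):
    """True if the two terms occur inside a window of max_sentences sentences."""
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.lower())
    pos_a = [i for i, s in enumerate(sentences) if term_a.lower() in s]
    pos_b = [i for i, s in enumerate(sentences) if term_b.lower() in s]
    # Sentences i and j share a window of n sentences when |i - j| < n.
    return any(abs(i - j) < max_sentences for i in pos_a for j in pos_b)
```

Mere co-occurrence corresponds to letting the span cover the whole
document. Narrowing it to three or four sentences is what gives proximity
its value as a stand-in for syntactical specification.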

Another interesting fact, which seemed to be indicated from these
experiments, was that the frequency of occurrence of a natural language
term within a document of whatever kind is not necessarily related to its
usefulness in storage or retrieval. This assertion has not
been as strongly indicated in some later experiments as it was in the
experiments involving nuclear physics literature. Yet it would seem to cast
doubt on those automatic indexing techniques which use frequency of
occurrence as their basis for the decision that a given term is
representative of the content of a document: that a term is a "keyword"
because of its frequency of occurrence.
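
For readers unfamiliar with the frequency-based techniques in question, a
bare-bones version is sketched below. It is a simplified illustration and
not Luhn's published procedure: the function name, the stop-word argument,
and the cutoff of three terms are all assumptions.

```python
import re
from collections import Counter


def frequency_keywords(text, stop_words, top_n=3):
    """Pick the most frequent non-excluded words as candidate keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return [word for word, _ in counts.most_common(top_n)]
```

A term chosen this way is a "keyword" only because it is repeated, which
is the assumption the nuclear physics experiments call into question.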
We must now move along to consider the present state of human
classification and indexing. You will recall, I hope, that I mentioned
above that at one point in phase two of the Swanson experiment, titles
alone were used to retrieve documents to answer the examination
questions of the experiment. Titles alone retrieved 93% of the source
documents but only 33% of non-source documents known to be relevant to
any given question. Now, a number of people have been interested in the
use of titles as indicators of subject content. As far back as 1960, a group
at IBM ran an experiment comparing four types of lexical indicators of
content. An article describing those experiments at IBM appeared in the
journal Human Factors in August 1960, under the title of "A
Re-Evaluation of Machine Generated Abstracts." I should like to quote the
summary to that article: "Twenty-five subjects were divided into five groups
matched for their reading speed and score on a sample criterion test.
Each group was given a different kind of lexical indicator of content to
a set of seventy-five documents, namely: titles, three types of abstracts, and
complete texts. In addition, they were given an examination composed of
seventy-two short-answer type questions, which were derived from fifteen
out of the seventy-five documents. They attempted to answer the
questions and evaluate the relevance of each document in answering the
examination." This experiment seemed to indicate that the task of
determining whether documents are relevant or not to some given
purpose can be performed by any one of the four types of lexical indicators
with about equal results.

A subsequent experiment was reported by Resnick in Science for 
October 6, 1961, under the title "Relative Effectiveness of Document
Titles and Abstracts for Determining Relevance of Documents." This
experiment reported a test undertaken as part of a system of Selective 
Dissemination of Information, wherein individuals were asked to deter- 
mine the relevance of documents to their interests on the basis of titles, 
and on the basis of abstracts. The results of this test seemed to indicate 
that there was no significant difference between the usefulness of titles 
and the usefulness of abstracts for such a purpose. 

Some question may be raised about both of these tests. The numbers
of people involved were small: in the first case, only twenty-five
individuals and only seventy-two articles; in the second case, 400 documents and
400 individuals were used in the test. However, the 400 documents were
separated into two groups of 200 each. I do not believe the numbers
involved in the first test, since they concern human activity, are really
significant, statistically speaking, despite all the statistical formulae to the
contrary. Of course, the same criticism can be levied against the Swanson
experiments involving nuclear physics articles. The primary objection 
to the second experiment revolves around the fact that the notices con- 
tained the author's name in all cases, and, quite likely, many individuals 
selected items of interest on the basis of authorship rather than anything 
else. Subsequently, Rath et al. devised another experiment to overcome 
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StfftftS 5 SCemed t0 Sh ° W ' als °' that titles were J" ust about as 
effective in determining content as abstracts or even full text 

[Author's name illegible], in 1960, published an article entitled "Value of Titles for
Indexing Purposes." He found that titles contained about [figure illegible]
of the terms under which they were indexed.

Although these experiments may all have flaws, and I do not say
that they do, they provide us with indications of trends which may be
further studied and verified.

In October 1962, an article entitled "Machine-Like Indexing by People"
was published in American Documentation. This report, by Montgomery
and Swanson, studied what was actually being done in a system
using human indexers. Let me emphasize what Swanson and Montgomery
themselves emphasized in their introduction but which was
still overlooked by some librarians. The study was not concerned with
how a given number of journal articles could have been indexed, or
should have been indexed, but only with the end-product, namely a
[...] the [...] concerned with what the individuals
[...] of their output, but only with what was [...]

This experiment did not involve evaluation of retrieval effectiveness
of any system, but rather it was based on the question, and I quote, "To
what extent can the human indexing operations that take place in an existing
system be simulated by machines?" Thus, the question of whether
or not a given system is effective for retrieval is by-passed. Briefly stated,
the study examined the September 1960 issue of Index Medicus. The
system of subject headings which Index Medicus uses is similar to the subject
headings used in regular library practice in that it assigns journal
articles, in this case, to pre-established subject categories, just as most
libraries assign books to pre-established subject headings. The study was
undertaken to see how many of the assignments could have been made
from an examination of the titles alone. None of the results were intended
to imply, nor was it stated, that the people that do the indexing for
Index Medicus actually examined only the titles. Some 4,770 entries were
studied. Matches could occur, or were considered to occur, in two ways:
one, the exact equivalent words were to be found in a title, or two, some
synonym of the subject heading was found in the title. (This is a great
over-simplification; however, the complete details are readily available
in the report.) In a machine procedure, of course, such a table of synonyms
could be stored in machine memory, so that the matching
process would be automatic. Matches were found in 4,083 entries.
There were 149 doubtful, and the remaining 538 did not contain words
synonymous in any way with the subject headings.
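
The two kinds of match described above (exact words, or a synonym found through a stored table) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the `SYNONYMS` table, its entries, and the function names are all hypothetical.

```python
import re

# Hypothetical synonym table: subject heading -> title words accepted as
# equivalents. The entries are illustrative, not taken from the study.
SYNONYMS = {
    "nephritis": {"kidney", "renal"},
    "aging": {"senescence"},
}

def title_words(title):
    """Lower-case a title and split it into a set of words."""
    return set(re.findall(r"[a-z]+", title.lower()))

def classify(heading, title):
    """Return 'exact', 'synonym', or 'none' for a heading-title pair."""
    words = title_words(title)
    heading_words = title_words(heading)
    if heading_words <= words:          # every heading word is in the title
        return "exact"
    if words & SYNONYMS.get(heading.lower(), set()):
        return "synonym"                # some stored synonym is in the title
    return "none"
```

A pair that returns "none" here would correspond to the entries that contained no matching or synonymous words; deciding what counts as "doubtful" would still require human judgment.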

The results of this exercise suggested that it would be interesting to
examine the special bibliographies which had been compiled in the
Bio-Medical Library, in response to requests for information. A
sample of 83 special bibliographies was examined. Once again the examination
was not to determine the actual relevance of the articles listed in
each bibliography, or how they were actually produced, but to determine
if they could have been produced by a computer from an examination of
the titles alone, using some sort of a machine dictionary or thesaurus as a
subject authority list, cross-reference guide, and synonym finder. Of the
total of 5,145 bibliographic citations evaluated, only 5.3% did not contain
at least one term which corresponded to the terms or concepts expressed
by the requester in his question.

O'Connor recently published in American Documentation, April
1964, the results of a study of the correlation of index headings and
title words in three medical indexing systems. O'Connor had raised some
doubts concerning the validity of the Montgomery-Swanson study at the
annual meeting of the American Documentation Institute in November
1961. In the interim period, he had been studying the problem of correlation
of headings and title words, and this recent article represents his
current viewpoint on the matter.

I will not go into the details of this study, except to indicate that the
three systems studied by O'Connor were: (1) volume 2 of the Index-Handbook
of Cardiovascular Agents; (2) documents indexed in 1955 for
the Merck Sharp and Dohme punched-card retrieval system; and (3)
the 1961 National Institutes of Health Research Grants Index. Not a
very great number of subject heading-title pairs were used by O'Connor.
He recognizes the limitations inherent in such a small sample, but he feels
that the small sample was large enough to permit some generalization.

He found that for the first system there was a 32% positive relation,
for the second system there was a 54% positive relation, and for the final
system there was a 26% positive relation. This is not especially encouraging
if one hopes to use titles as a means of automatic indexing or abstracting
of printed materials. However, O'Connor does seem to make
some mistakes; at least, it would appear to be the case from the samples
which were printed in American Documentation. I will mention only
one. Under the heading Nephritis the title "The Role of the Kidney in
Protein Metabolism" is not considered to have any correlation with the
heading. However, the word Nephritis comes from the Greek word
meaning kidney, and it would seem that a set of rules could be constructed
which would tie the two words together. There are a number of
other instances which appear to me to indicate the possibility of automatic
processing, were a machine thesaurus of reasonable complexity
and quality available to the computer.

Another interesting piece of work contrasting human concept indexing
with machine indexing is reported in the April 1964 issue of American
Documentation. M. J. Ruhl, under the title "Chemical Documents
and Their Titles: Human Concept Indexing versus KWIC-Machine
Indexing," reports a study which compared the indexing of identical
documents appearing in Chemical Titles and in the Chemical Abstracts
Subject Index. (Chemical Titles is a publication made up solely of a
key-word-in-context index prepared by machine, plus author index.)
This study showed that more than half of the titles included all concepts, 
or their equivalents, as indexed by Chemical Abstracts. This is a par- 
ticularly interesting and important study, since for many years the Chemi- 
cal Abstracts Subject Index has been the standard against which others 
were compared. The work reported in this study was actually done from 
September through December 1961. Since that time, several changes 
have been made in Chemical Titles format, editing rules, and word omis- 
sion rules. It would appear from the changes which have been made that 
were a study made now, the title might be even better in indexing every 
concept than it was at that time. 

It is only fair to point out, of course, that the human indexing done 
on these documents was done from the abstract and not the full text of 
the article. It may well be that the indexers relied too heavily on the 
document titles for their indexing, neglecting the abstract text; or the 
abstractors relied too heavily on the titles in preparing the abstracts. The 
general point of the article was that authors and editors must be con- 
sciously aware of the increasing importance of the title, and, indeed, this 
is a good point. 

At one time I was in charge of the Physics Library at the University 
of California, Los Angeles. Intrigued by some of the earlier title studies, 
I selected all of the significant additions to the Physics Library collection 
made over the period of a year and analyzed the titles of these materials 
to see whether or not the subject headings which were assigned by the
catalogers could have been assigned by a machine process were a reason- 
able thesaurus available to a computing machine. 

The Physics Library was producing a monthly list of new accessions. 
It was from a year's cumulation of these lists that the titles which I 
examined were taken. To be included on the list an accession had to be 
considered a significant addition to the collection. Such things as added 
copies of titles already in the Library, older materials of little interest 
other than historic, and elementary physics textbooks of a high school or 
junior college level were excluded. Also excluded were certain items 
which were purchased solely for the use of the library staff of the 
Physics Library, i.e., materials which were to aid the library staff and 
were, typically, not to be used by the staff and students of the Physics 
Department. 

Before describing Table II which illustrates my findings, I must de- 
fine one term: the term conceptual equivalent. I have chosen to use this 
term and to create a definition of it for use in connection with Table II. 
Others have used varying terms to mean, more or less, the same thing. 
For example, O'Connor 18 used the term "synonym inclusion," and his 
conditions for such were as follows: (1) a single word heading is identi- 
cal with a title word: e.g., aedes and "aedes;" (2) a single word heading 
is an inflexional variant of a title word: e.g., aggression and "aggressivity;"
(3) a single word heading is (reported by an authoritative dictionary to
be) a synonym of a title expression: e.g., aging and "senescence;" (4) a
word which is part of a multi-word heading, and occurs in no other
heading, is related to the title by synonym inclusion: e.g., tranquilizing
agents and "tranquilizing" or a synonym; (5) each word in a multi-word
heading without unique heading words (in the sense of number (4)) is
related to the title by synonym inclusion: e.g., stomach neoplasms and
"stomach neoplasm."

I have used the term "concept equivalent" to mean the type of 
synonymous equivalents which one would find in a subject heading 
authority file: e.g., the term NUCLEONS as listed in a subject heading 
authority might say "see PARTICLES (NUCLEAR PHYSICS)." A term 
such as RADIATIVE TRANSFER might show "see RADIATION." In 
addition, there are such equivalents as "conference" and "congress" 
which, for all practical purposes, mean the same thing in most printed 
proceedings. The term "anthology" is clearly equivalent to the two-term
phrase COLLECTED WORKS which is a part of a form subdivision. A
phrase such as "information for the engineer" can be considered to be
equivalent to the heading ENGINEERING, and the phrase "men of 
science" can be considered equivalent to the heading SCIENTISTS. 

The other studies of titles were concerned with journal articles. Sub- 
ject headings used in indexes and abstracts for journals differ, in many 
respects, from the subject headings used in an academic library. For 
example, there are many "form" headings used in library catalogs, and 
they are relatively rare in periodical indexing. DISSERTATIONS, 
ACADEMIC which is then subdivided by the institution and the depart- 
ment, and the subdivisions such as ADDRESSES, ESSAYS, AND LEC- 
TURES or COLLECTED WORKS are examples of headings used at 
UCLA. These are all commonly found in regular library cataloging or 
classification, but they are not generally a part of the problem of pro- 
ducing subject indexes to journal articles. Another phase of library cata- 
loging which is not found in the normal journal index is the corporate 
author. In some cases, the combination of corporate author and title may 
well indicate something of significance in the automatic production of 
subject headings. For example, the proceedings of a conference on vac- 
uum technology would form a corporate entry in a regular library cata- 
log, and the title alone, which might be "Proceedings" or "Transactions," 
normally would be meaningless in an automatic system. However, the 
combination of the corporate author plus title could well be used to in- 
dicate the exact subject heading, in this case VACUUM-CONGRESSES, 
which is given to such material. 

In Table II, materials are divided into five classes. The first class
(Equal) is made up of items for which words in the title were exactly
matched by the subject headings chosen by the cataloger, or were conceptually
equivalent (as defined above) to the subject heading or headings
chosen. In many cases there were multiple subject headings, and to
be included in this class, all subject headings had to be matched exactly
by words in the title.

The second category of Table II is No Match, which is used for those
title-subject heading pairs which are not clearly enough connected, so
that a simple process utilizing computer look-up in a machine thesaurus
probably could never produce automatic subject heading assignments.

TABLE II

A Comparison of Book Titles and Subject Headings,
Taken from a 12-Month Period of Acquisitions for
the Physics Library of the University of California, Los Angeles,
1961-62

Category¹                   Number of Titles    % of Total²

1. Exact or equivalent            147                36
2. No match                        72                18
3. Probable match                 151                37
4. Special problems                19                 5
5. No headings                     21                 5

Total number of titles            410

¹ For explanation of categories, see discussion in text.
² Percentage figures are rounded off.
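
As a check on the arithmetic of Table II, the rounded percentages can be recomputed from the counts. The sketch below assumes the five counts read as 147, 72, 151, 19, and 21; these sum to the stated total of 410, but the reading is an assumption where the printed table is unclear.

```python
# Counts as read from Table II (an assumption where the print is unclear).
counts = {
    "Exact or equivalent": 147,
    "No match": 72,
    "Probable match": 151,
    "Special problems": 19,
    "No headings": 21,
}

total = sum(counts.values())

# Each category's share of the total, rounded to a whole percent.
percentages = {name: round(100 * n / total) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name:<22}{n:>5}{percentages[name]:>5}%")
print(f"{'Total':<22}{total:>5}")
```

Because each figure is rounded separately, the percentages need not add to exactly 100.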

The third category is Probable Match, which is used to indicate that: 
(1) either the subject headings actually assigned are not exact matches, 
but are still probably "conceptually equivalent"; or (2) that where there 
were multiple subject headings assigned to a given title, only one or two, 
but not all, of the headings matched. Many of the items in this category 
actually had perfect matches of perhaps one or two subject headings. Yet 
there were more subject headings assigned, and the remainder did not 
contain matching words or matching concepts. 

The two final categories are really equivalent to the second type, that 
is, No Match. I have arbitrarily divided these two into (4) Special Prob- 
lems, and (5) No Assignment Made. Special Problems indicates items 
which were too difficult conceivably to be assigned by machine processes 
but which had elements of further interest in them for detailed analysis. 

Let us consider the title "Statistical Theory and Methodology in 
Science and Engineering." This title was given two headings: (1) 
MATHEMATICAL STATISTICS, and (2) EXPERIMENTAL DE- 
SIGN. Now a knowledgeable human can see easily the relationship be- 
tween these subject headings and the words in the title; yet looking at 
this from the other side, it seems doubtful that even a very knowledgeable 
human would have been willing to suggest these two assignments with- 
out, at least, looking a little further than just the title. At any rate, these 
particular items were interesting enough to me to be put into a special 
category for further study and analysis. 

The final group, those with "No Headings Assigned," are all disserta- 
tions for the doctoral degree in physics. Each of these was assigned a form 
heading, i.e., DISSERTATIONS, ACADEMIC-UCLA-PHYSICS by
the Catalog Department, and sent to the Physics Library for purposes of
subject heading assignment at a later date. These items were all so difficult
(i.e., esoteric) that no regular subject headings were assigned by the time
they were entered on their respective New Booklists. Subsequently, they
were given some sort of subject heading, but the headings given were not
necessarily those used by the Library of Congress or the UCLA Cataloging
Department, and they were assigned in the Physics Library and not
by the regular catalogers. Thus, they were not included in this study.

What does all of this mean in terms of automatic indexing as it now 
stands? We must ask ourselves, first of all, what is it that we are attempt- 
ing to produce when we classify or index? We are attempting to produce 
a product which is a representation of a body of knowledge. We must 
have this representation, abbreviated though it may be, since we cannot 
examine each and every item in our collections. This latter, of course, 
would be the only sure way of determining whether or not each and every 
document did or did not contain something relevant to a given individ- 
ual's need for information. Even in one's own personal collection of books 
and journals, it is not possible to examine each and every item every time 
one needs to find some bit of knowledge. Thus, our products are for the 
purpose of presenting a searchable representation of our library collections.

Enough research has been performed to indicate that any system of
subject headings, coordinate index terms, machine retrieval, whatever, 
which continually makes its specifications narrower, in order to eliminate 
irrelevant material, will be subject to the error of over-specification. This 
error will cause relevant material, to a greater or lesser extent, to be 
missed. The problem, then, is not to find a solution which will be theo- 
retically perfect and which will prevent all irrelevant retrieval as well as 
produce all relevant materials in any given collection. For such a solution 
exists only as a rather fantastic dream. Rather, our goals should be to
produce a system which will minimize irrelevant retrieval and maximize
relevant retrieval to the greatest extent practical. The emphasis should 
be on the word practical in all cases. 
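
The trade-off described here, in which a narrower specification excludes irrelevant material at the cost of missing relevant material, can be illustrated with two measures that later became standard, precision and recall. These terms are not used in the text above, and the document identifiers below are invented for illustration only.

```python
def evaluate(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision: share of retrieved items that are relevant
    recall:    share of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical collection: four documents are relevant to the request.
relevant = {"d1", "d2", "d3", "d4"}
broad_search = {"d1", "d2", "d3", "d5", "d6", "d7"}   # loose specification
narrow_search = {"d1", "d2"}                          # tight specification

broad_result = evaluate(broad_search, relevant)
narrow_result = evaluate(narrow_search, relevant)
```

In this invented case the narrow search retrieves nothing irrelevant but finds only half of the relevant documents, which is the error of over-specification. Note that computing recall at all presupposes knowing every relevant item in the collection.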

Actually we are operating very much in the dark. Justification of 
any system must ultimately be based on successful retrieval, but how can 
success be evaluated within an existing collection of any size which is in 
operation on a daily basis? It cannot. Success can only be evaluated in
terms of a closed system, that is, a system wherein sufficient knowledge is 
available of the entire contents of the Library, so that evaluation may be 
made of various techniques. And, unfortunately, there is no way of know- 
ing whether we have retrieved all pertinent or relevant materials in a 
real situation. It is impossible to know, since it is impossible to be familiar 
with every item in a collection. 

We might now ask, why should anyone be interested in automatic 
classification and indexing? There are many reasons. For example, many, 
many libraries have large arrearages in their catalog departments. This 
would seem to indicate that they do not have enough catalogers to process
the materials which are coming into the system. Also, in all systems
utilizing human beings as classifiers and indexers, the success with which 
information can later be retrieved depends critically on the care and ef- 
fort expended in analyzing the materials as they are entered into the 
system. That human beings do not perform optimally at all times is well
known, but frequently forgotten. Indeed, the level of human performance
is quite variable from one period to another. This point is covered in a
recent article by DeLucia which appeared in the April 1964 issue of
American Documentation 6 and was entitled "Index-Abstract Evaluation
and Design." We frequently tend to forget that human beings have daily 
cycles of efficiency, that individual classifiers have varying abilities de- 
pending on the subject matter they are indexing, and that outside pres- 
sures frequently cause a decrease in efficiency. In comparing machine 
processes with human manual manipulations, there are some who in- 
variably show the human system in its best light, as if every individual 
involved were a veritable paragon of intelligence and virtue: one who 
has never made a mistake in his life, knows his job and subject matter 
perfectly, has learned (and remembers) every policy memo sent to him
(and to everyone else in the organization), has always shown acute in- 
sight as well as considerable foresight, and can work tirelessly, at peak 
efficiency, 365 days of the year, at least 28 hours a day. The fact: people 
come and go, are sick, die, are replaced by new employees, must be 
trained, do not always work at peak efficiency, take vacations and coffee 
breaks, etc., may walk off with large quantities of the institution's supplies 
and equipment, may suffer memory failures — in other words, they are 
human. None of this seems to bother these foes of mechanization. 

Experiments have been performed that seem to show that automatic 
indexing or classification can produce usable, reliable, and, above all, 
practical results, provided that some measure of machine readable and 
interpretable input is available. A machine system for classification or 
indexing depends on the ingenuity and resourcefulness expended upon 
the development of the system, as well as the continued proper operation 
of the machinery that is to be a part of the system. Mechanical and elec- 
tronic reliability is now specifiable beforehand (in terms of mean hours 
before failure), whereas human reliability cannot be predicted. Provided 
that overall effectiveness is nearly equal, the system that depends less on 
the human element would clearly seem to be more desirable from a 
standpoint of reliability and efficiency, and perhaps even from a stand- 
point of economics as well. 

It is perhaps a truism that input is more important than output, for 
if there is failure at the output end, i.e. in retrieval, one can always try 
again. But if there is failure at the input end, the material is most likely 
irretrievably lost. In a system utilizing human classifiers and indexers, 
overall performance of the system, particularly for retrieval, is a direct 
function of input quality. The quality of input is, in turn, a direct function
of four elements concerning personnel: (1) availability of qualified
individuals; (2) their knowledge of the system by which they will index,
i.e., the system of classification or subject headings; (3) quality of their 
training; and (4) continuing coordination and auditing of their work. 

In a mechanized system, only one of the above four items is a factor, 
i.e. the continuing coordination and auditing of the system performance. 
The other three factors are taken care of once and for all at the beginning, 
during the establishment of the system. Subject heading authority guides, 
the knowledge of many individuals, cross references, etc., can all be placed 
in a machine memory, and the system tested again and again, each time 
perfecting it a little more, so that ultimately a practical system is obtained. 

To conclude, I should like to summarize a system which I believe can 
be used to provide automatic classification and indexing. 19 It depends on 
more than titles alone. What I would add as machine readable input is as 
follows: table of contents; an index contained in the book itself (if any); 
introductory paragraphs which describe the contents of the book; a short 
indicative abstract (where available) would also be useful. Such material 
could be put into machine readable form (where it is not now available) 
by clerical personnel, following a consistent set of rules as to the elements 
of the materials to be used and in what order they are to be followed. 

This system works by means of a machine thesaurus which contains a
vocabulary of words weighted for retrieval importance. The computer
looks up in the thesaurus every word and contiguous word pair of each 
sentence of input. (In this sentence contiguous word pairs would be: 
"every word," "contiguous word," "word pair," etc.) The thesaurus con- 
tains cross references, and all words are grouped into synonym classes. 
The computer keeps track of the location of each word, and thus proxi- 
mity and pairing factors can be calculated. (Pairing factors refer to com- 
binations of words which make up subject concepts. For example, the 
terms nuclear, power, and propulsion when joined together as in nuclear 
power, or nuclear propulsion, mean something quite different from nu- 
clear alone, power alone, and propulsion alone.) 
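
The look-up step just described (every word and every contiguous word pair checked against the thesaurus, with the position of each hit recorded) might be sketched as follows. The thesaurus contents, the weights, and the function name are hypothetical; a real thesaurus would be far larger and built by subject specialists.

```python
import re

# Hypothetical thesaurus: term -> (synonym class, weight). Illustrative only.
THESAURUS = {
    "nuclear": ("NUCLEAR", 1),
    "power": ("POWER", 1),
    "propulsion": ("PROPULSION", 1),
    "nuclear power": ("NUCLEAR POWER", 5),
    "nuclear propulsion": ("NUCLEAR PROPULSION", 5),
    "atomic power": ("NUCLEAR POWER", 5),   # cross reference to the same class
}

def lookup(sentence):
    """Look up every word and contiguous word pair of a sentence.

    Returns a sorted list of (position, synonym class, weight), where
    position is the index of the first word of the matched term.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    candidates = list(enumerate(words))
    candidates += [(i, f"{a} {b}") for i, (a, b) in enumerate(zip(words, words[1:]))]
    found = [(i, *THESAURUS[term]) for i, term in candidates if term in THESAURUS]
    return sorted(found)
```

For the title "Nuclear propulsion for ships" this returns the single words NUCLEAR and PROPULSION as well as the pair NUCLEAR PROPULSION, each with its position, so that later steps can prefer the more heavily weighted pair.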

Weights and pairing factors are put into the thesaurus by knowledge- 
able human beings. That is, certain words in certain subjects are more 
important than other words. This is readily recognized. Thus, in determining
subject headings, weights and pairing factors are very important,
but they are created, for storage in the thesaurus, by humans.
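
One way human-assigned weights and pairing factors could be combined into a score for each candidate subject heading is sketched below. The scoring rule (sum of word weights, plus a bonus when paired words occur close together), the proximity window, and all of the values are assumptions made for illustration; the text does not specify a formula.

```python
import re

# Hypothetical weights assigned by a human editor: heading -> {word: weight}.
WORD_WEIGHTS = {
    "NUCLEAR PROPULSION": {"nuclear": 2, "propulsion": 3, "reactor": 1},
    "ELECTRIC POWER": {"electric": 3, "power": 2},
}
# Hypothetical pairing factors: bonus when both words occur close together.
PAIRING = {
    "NUCLEAR PROPULSION": [("nuclear", "propulsion", 5)],
    "ELECTRIC POWER": [("electric", "power", 5)],
}
MAX_GAP = 2  # assumed proximity window, in words

def score_headings(text):
    """Score each candidate heading from word weights and pairing bonuses."""
    words = re.findall(r"[a-z]+", text.lower())
    positions = {}
    for i, word in enumerate(words):
        positions.setdefault(word, []).append(i)

    scores = {}
    for heading, weights in WORD_WEIGHTS.items():
        # Each occurrence of a weighted word adds its weight.
        score = sum(wt * len(positions.get(w, [])) for w, wt in weights.items())
        # A pairing bonus applies when the two words are near one another.
        for a, b, bonus in PAIRING.get(heading, []):
            near = any(
                abs(i - j) <= MAX_GAP
                for i in positions.get(a, [])
                for j in positions.get(b, [])
            )
            if near:
                score += bonus
        scores[heading] = score
    return scores
```

The heading with the highest score would be the machine's proposed assignment; where to set the threshold below which no heading is assigned is a matter for the trial runs.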

For each subject to be included in the automatic system, a separate 
thesaurus must be constructed. This is similar to creating a special sub- 
ject heading list for any given subject. As a matter of fact, subject head- 
ing lists provide a good basis for the beginning of a thesaurus. However, 
it is suggested that processing of natural language text from various sub- 
ject disciplines may well provide new terms not now in existing subject 
heading lists. 

The whole process is perfected by running it parallel with human 
classifiers and indexers through a pilot period when both systems process 
the same materials. As machine errors are discovered, it is hoped computer
programming rules can be modified and changes made in the
thesaurus so that the next trial, using different materials, will then result 
in a better process. This can be continued for a considerable length of 
time, ultimately, one would expect, resulting in a practical machine 
system for classification and indexing. 
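
The parallel run could be monitored with a simple comparison of the headings assigned by the catalogers and by the machine to the same items. The sketch below is one possible form of such a report; the function name and the sample data are hypothetical.

```python
def compare_assignments(human, machine):
    """Compare heading sets per item.

    human, machine: dicts mapping an item identifier to a set of headings.
    Returns (agreement rate, discrepancies), where discrepancies lists, for
    each item on which the two differ, the headings the machine missed and
    the extra headings it assigned.
    """
    discrepancies = {}
    agree = 0
    for item, human_headings in human.items():
        machine_headings = machine.get(item, set())
        if machine_headings == human_headings:
            agree += 1
        else:
            discrepancies[item] = {
                "missed": human_headings - machine_headings,
                "extra": machine_headings - human_headings,
            }
    rate = agree / len(human) if human else 0.0
    return rate, discrepancies

# Hypothetical pilot data for two titles.
human = {"a": {"OPTICS"}, "b": {"VACUUM", "CONGRESSES"}}
machine = {"a": {"OPTICS"}, "b": {"VACUUM"}}
rate, discrepancies = compare_assignments(human, machine)
```

Each discrepancy is a candidate for a change to the thesaurus or to the programming rules before the next trial. A disagreement is not necessarily a machine error, since the catalogers' own assignments may vary.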

It is not clear, at this point, how much machine-readable input is 
necessary to produce usable automatic classification and indexing. We 
have mentioned the numerous tests and experiments which looked at 
titles alone as the potential machine input. It would seem that titles alone 
are not enough, but how much more we must have in each and every 
subject discipline which is covered by a large academic library, no one 
can say. Actually, no one can really say how much is needed in any subject 
discipline at the present time, although one has reason to believe that in 
the sciences it may be that, with carefully constructed titles, very little 
more than the titles and the table of contents would be adequate. This, of 
course, would mean that the input costs could be quite low. 

Certainly, the categorizing of printed materials is a profound matter. 
No one is suggesting seriously that all humans will be relegated to a state 
of useless bystanding. There are obviously many, many books and other 
printed materials which are exceedingly difficult to classify or index be- 
cause of their esoteric contents or the opacity of their writing. In a system 
utilizing automatic indexing or classifying, there will always be a need for 
knowledgeable humans to take the difficult materials which the machine 
is unable to deal with. Rather than making catalogers feel inferior to a 
machine, if they are given only the most difficult materials to catalog, it 
should make them feel superior since the machine process cannot do the 
job. 

How close are we now to an automatic system? Not many months ago 
a Washington, D.C., firm, Information Systems, Inc., announced the
availability of a system of automatic cataloging and classification em- 
ploying the IBM 1410 Computer. I have seen a sample output of this 
process, and it appears quite usable. 

If we are truly interested in better and better library systems to serve 
our varied clientele more usefully, and if a machine system of cataloging, 
classification, and indexing can make a library more usable, then no one 
should oppose it. If a machine system cannot do this, then it should be 
discarded. At this point we do not know which will prove superior, but 
we ought not to base our decision solely on obvious cost factors. 

I suspect that there are some who will label me a misanthrope. I shall 
deny that, although I must confess that there are certain properties of 
machines which I find appealing. Machines believe everything one tells 
them, literally. They never misinterpret one's meaning. They rarely 
gossip, nor do they impute sinister motives to one's every action or word. 
Let me hasten to add that I do not wish to imply that humans do those 
things either. But they possess the means to do so, and machines do not. 
I suppose that the moral of this tale is that, if you have any choice in the
matter, you should choose humans for your friends and machines for
your enemies.

Volume 9, Number 1, Winter 1965

Computerized Serial Records

Fort Collins, Colorado



I SPENT OVER a week looking for a story concerning a humorous
aspect of serials, but I finally came to the conclusion that serials 
aren't funny. I think most librarians will agree with me. 

A friend of mine who sells computers was once talking of the "good 
old days" in his sales career when he sold encyclopedias door to door. He 
then made a sale, delivered the set of books, pocketed his commission, 
and never saw the customer again. Now, as a computer salesman, he had 
to work with the customer's employees after the sale so they would be 
trained before the machine was installed. He had to see that they got off 
on the right foot after delivery time. He had to keep training new custom- 
ers' employees. He had to keep soft-ware up-to-date even on out-of-date 
computers. He said that acquiring a customer now meant that he would 
become that customer's nursemaid for life. 

A similar situation exists in libraries in considering monographs and 
serials. A monograph is a one shot deal. Someone wants a particular item, 
we identify it, put it on order, catalog it after it arrives, and then, as far as 
technical processes are concerned, forget it.

Serials, of course, are like the computer. Once a serial title is on order, 
we become its slave. Even at the time of initial order we have to decide 
whether to start with the current issue or the first issue of the current 
volume. Do we want backfiles at once? Later? Do we want microfilm or 
the original paper? Once the thing arrives, then binding rears its ugly 
head. Every year subscription renewals have to be made. Changes in title 
and in frequency, mergers and splits, non-arrivals and subsequent claims 
have to be watched for like a hawk looking for his lunch. 

Well, you know all the problems. Serials are just a mess!

It is precisely because of these problems that some sort of systema- 
tized solution has been sought for serials. In Joe Becker's recent article 
in the ALA Bulletin* he briefly covers the history of serials record keep- 
ing. Except for a handful of libraries and a lot of wishful thinkers, serials 
record keeping stopped at the Kardex or Acme type file in either a cen- 
tralized or non-centralized system. The usefulness of these files is usually 
dictated by their size. A small file of, say, less than 1,000 titles, usually has 
all the information about each of the titles located on one record. Checking-in,
binding, finances, subscriptions, holdings, claims, etc., are in one
place. The amount of floor space this record covers is less than that
covered by a desk, and one person can easily maintain the file.

* Becker, Joseph. "Data Processing Equipment in Libraries; Automating the Serial
Record." ALA Bulletin, 58:557-560. June, 1964.

As the file grows in number of titles, it usually grows disproportion- 
ately in bulk, too. The choice is either to put fewer items into a tray at a 
time (when more items per tray are really necessary) or to take some of 
the information out of the file and store it in a second (or third or fourth) 
file. The file is now decentralized, more people can work at the records, 
and material in the file is not damaged by overcrowding. The file, of 
course, takes up more floor space, more physical movement is required in 
using the main part of the file, more knowledge and experience is needed 
in using all parts of the file, and certain pieces of information have to be 
expensively repeated on each of the different records. This, then, is the 
situation in which most of us find ourselves. We are sure things will get 
worse before they get better. 

It is too bad that library solutions to serials problems have to come 
after the problem is created. The several committees of the professional 
organizations have suggested ways in which publishers could cease creat- 
ing problems, but so far, suggested plans for standard format and loca- 
tion of bibliographic data on a serial piece, volume and piece numbering, 
size and thickness of a single piece, and other quantitative measurements 
of a periodical subject to variation and of interest to librarians have been 
ignored by publishers even in the library field. I feel that persons as 
individualistic as publishers, their layout men, and artists will probably 
never cooperate in a significant number unless standards were to be tied 
into their second and third class mailing privilege. I'm not sure that I'd 
recommend any measure this strong. 

Let us keep in the back of our minds, though, the fact that problems 
with serials are created by the publishers, but for now we will have to find 
solutions to these problems within the library. 

Serials seem to be deceptively simple to automate when, in fact, they 
are more complex than any other phase of library technical services. It is 
the definition of serials which gives us this sense of false security. We 
might, for example, define a serial as an item issued in parts, one or more 
of which must arrive each year. The definitions of what constitutes a 
serial are almost as varied as the number of librarians making the defini- 
tions. However, regularity of arrival (or at least an assumed regularity), 
is the basis for nearly all such definitions. It is this assumed regularity 
which presents the problems. We can illustrate this very nicely if we make 
a list of all the arrival frequencies possible with the serial titles which 
come under our definition. First, we have once per year in any of the 
twelve months, then twice per year with the first one arriving in any of 
the first six months and the second piece arriving six months later. Then 
an item arriving three times per year with the pieces coming in any 
month, then four times per year with arrival in any of the twelve months 
with the added complication of their sometimes being dated by the month 
and sometimes by the season. We can carry this through to the newspaper 
which is issued every day of the year without fail. Assumed regularity 
means actually any possible combination of months, weeks, and days. 
Some kind of a formula can now be set up so that peculiarities of any 
serial arrival can be represented with a short understandable notation. 
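
One possible shape for such a notation is sketched below. The `N/MM` code (issues per year, then the month of the first issue) is an assumption made purely for illustration, not the notation of any library mentioned here. It covers only evenly spaced frequencies of monthly or slower, and ignores the month-versus-season dating complication.

```python
def expected_months(code):
    """Expand a hypothetical 'N/MM' frequency code into a list of arrival months.

    N  -- issues per year; only evenly spaced cases (1, 2, 3, 4, 6, 12) are handled
    MM -- month (1-12) in which the first issue of the year arrives
    """
    per_year_text, first_text = code.split("/")
    per_year, first = int(per_year_text), int(first_text)
    if per_year < 1 or 12 % per_year != 0:
        raise ValueError("unsupported frequency: %d per year" % per_year)
    step = 12 // per_year            # months between consecutive issues
    if not 1 <= first <= step:
        raise ValueError("first issue must fall in months 1-%d" % step)
    return [first + i * step for i in range(per_year)]


def titles_expected(codes, month):
    """Return the titles whose code says an issue should arrive in `month`."""
    return sorted(title for title, code in codes.items()
                  if month in expected_months(code))
```

Because the code is short and regular, a whole file of such codes can be compared against a given month in one pass, which is what makes it more useful to a machine than to a clerk.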

This notation of frequency would not do a clerk much good, for a 
quick glance at the check-in record or possibly even the piece itself would 
tell her as much in less time. However, this short notation can be punched 
into paper tape or into a tabulating card with other similar notations 
concerning payment, subscription, binding, holdings, and claiming pro- 
cedures. These notations can then be sorted and compared in many dif- 
ferent ways, so that one particular serial with certain attributes or all 
serials with common characteristics can be noted. 

One very interesting application of this nature is being used at the 
University of Illinois, Chicago Undergraduate Division Library. While 
it is a non-computerized application, I would like to describe it briefly be- 
cause I think that any library which had access to a card punch and a 
sorter could benefit from it. 

The original purpose of this file was to gather information about the 
serial titles that was not listed in the Serials Record. As the record was not 
a centralized one, although it was called a Central Serials Record, it was 
felt that some kind of intermediate record was necessary. The intermedi- 
ate record was originally organized around one punched card for each 
title. An abbreviated title was selected using a small set of abbreviations 
limited to 35 characters. An eight-character sequence number was also 
assigned. All titles could then be put into alphabetical order by sorting 
on this number only. 

One column was devoted to method of acquisition. A one code 
punched in this column meant that the title came as the result of a mem- 
bership. A two meant subscription, a three was standing order, and so 
forth. Two columns were devoted to source of acquisition, and one 
column each was devoted to the month in which the volume number 
changed, the month the index was published, the month binding was 
picked up, the year the item was bound, the method of index arrival, the 
shelving location, the type of binding, and a code denoting dead titles or 
items not checked in. 
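
The card just described can be modelled as a fixed-width record. In the sketch below the field widths are taken from the description above, but the order of the columns, the field names, and the use of an 80-column card are assumptions. Only the three acquisition codes named above are listed.

```python
# (field name, width in card columns); the order of fields is assumed.
LAYOUT = [
    ("title", 35),               # abbreviated title
    ("sequence", 8),             # sequence number that yields alphabetical order
    ("acquisition_method", 1),
    ("source", 2),               # source of acquisition
    ("volume_change_month", 1),
    ("index_month", 1),
    ("binding_month", 1),
    ("year_bound", 1),
    ("index_arrival", 1),
    ("shelving_location", 1),
    ("binding_type", 1),
    ("status", 1),               # dead title / not checked in
]

ACQUISITION = {"1": "membership", "2": "subscription", "3": "standing order"}


def punch_card(fields):
    """Lay the given field values out as one 80-column card image."""
    parts = []
    for name, width in LAYOUT:
        value = str(fields.get(name, ""))
        if len(value) > width:
            raise ValueError("%s is wider than %d columns" % (name, width))
        parts.append(value.ljust(width))   # unknown information stays blank
    return "".join(parts).ljust(80)


def parse_card(card):
    """Split a card image back into named fields; blank columns become ''."""
    card = card.ljust(80)
    fields, start = {}, 0
    for name, width in LAYOUT:
        fields[name] = card[start:start + width].strip()
        start += width
    return fields
```

The layout uses 54 of the 80 columns, so there is room left on the card for further codes.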

The file was used as follows: everything that was known about a 
serial title was assumed to be in the serials record. All this information 
was punched into the card. If it turned out that the month the index was 
published was not in the record, then the column was left blank. Later 
the whole file was sorted on this particular column. The items which had 
been left blank would drop out in the reject pocket of the sorter. These 
cards would then be taken to the shelves and checked against the actual 
periodicals or they could be checked against Ulrich's or other serial 
bibliographic sources. In this way the information in the serial record 
was completed. 

Louis Schultheiss, who thought up this whole idea in the first place, 
now realized that he had all kinds of valuable information captured in a 
machine-readable form. Why not get some immediate use out of it? The 
file could be run through the sorter, and the cards for the serials which 
have indexes which should have arrived during the month dropped out. 
A quick check against the receipt file or the file drawer where the indexes 
are actually stored would show which ones had arrived and which ones 
should be claimed. 

Or the file could be sorted on the column denoting month to be sent 
to the bindery. Only the items to be sent this month would be dropped 
out with the rest of the file left in order. The cards for the items to be sent 
during the month can then be sorted again on the column showing 
shelving location. It would be simple then to hand a clerk a bundle of 
cards and tell him to go to the Reference Department and take the last 
volume off the shelf and prepare it for the bindery. He could do the 
same with the bundles of cards for serials shelved in all other locations in 
the library. 
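
The sorter passes described above can be imitated in a few lines. This is a sketch only: the cards are represented as dictionaries, and the field names and the "reject" label are assumptions chosen for the example.

```python
from collections import defaultdict


def sort_on(cards, field):
    """One sorter pass: distribute cards into pockets by the value of one field.

    Cards that are blank in that field fall into the 'reject' pocket.
    The original order of the cards is kept within each pocket.
    """
    pockets = defaultdict(list)
    for card in cards:
        pockets[card.get(field) or "reject"].append(card)
    return dict(pockets)


def bindery_bundles(cards, month):
    """Pull the cards due at the bindery this month, bundled by shelving location."""
    due_this_month = sort_on(cards, "binding_month").get(month, [])
    return sort_on(due_this_month, "shelving_location")


deck = [
    {"title": "A", "binding_month": "6", "shelving_location": "R", "index_month": "D"},
    {"title": "B", "binding_month": "6", "shelving_location": "S", "index_month": ""},
    {"title": "C", "binding_month": "9", "shelving_location": "R", "index_month": "1"},
    {"title": "D", "binding_month": "6", "shelving_location": "R", "index_month": ""},
]

# Cards with no index month recorded: these go to the shelves for checking.
unknown_index = sort_on(deck, "index_month")["reject"]

# One bundle of cards per shelving location for the June bindery shipment.
june = bindery_bundles(deck, "6")
```

Each question asked of the file is simply a different choice of column, which is why the same deck serves both the record-completion job and the bindery job.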

These are but two examples of how useful a punched card serials file 
would be to any sized library. Every librarian should be able to think up 
many other applications.* 

Now, why this long-winded description of a non-computerized serials 
record when this talk has been billed as one dealing with computerized 
serials records? First, the type of work involved prior to the application is 
the same in both cases. That for the computerized record has greater 
scope, and it must be absolutely complete before any of it is useful. 
Second, the type of applications are the same, although those which are 
computerized can be of much greater variety and of infinitely greater so- 
phistication. Third, the goals are essentially the same: better service to 
library users for less cost. 

Let us take a look at three typical computer applications to serials 
problems of libraries. The first is a holdings list. These lists have become 
very popular as serials records have become both more complex and less 
accessible to the public. They can range all the way from very restricted 
one line per title lists to complete ones which have as much information 
for each item as a library catalog entry. Once the decision to make such a 
list as this has been made, the scope of the list has to be decided. I believe 
that this is the most difficult part of the whole procedure. The librarian's 
definition of what constitutes a serial must be brought out and re-argued. 
Should duplicate copies be listed or only the most complete one? What 
about government document serials? Are continuations fair game for the 
list? Will dead titles be included, or is this for live ones only? Is a distinc- 
tion going to be made between a serial which died and one which the 
library merely stopped subscribing to? In other words, where you draw 
the line is up to you. 

* Mr. Schultheiss has prepared a detailed account of this process as well as its com- 
puterized sequel. This paper will be published in a forthcoming issue of LRTS. 

Once the scope of the list has been decided to everyone's satisfaction, 
then the form of the individual parts of the list can be made. Is the title 
going to be complete in every case or abbreviated in some cases? If it is 
to be abbreviated, then how short should the abbreviation be? The li- 
brary of the University of California at San Diego limits its short title to 
28 spaces. The University of Illinois at Urbana limits that one to 569 
characters or complete title for all practical purposes. The eventual in- 
tention is to use a computer to do their abbreviations for them. The Uni- 
versity of Illinois in Chicago and Florida Atlantic University use the belt 
and suspenders philosophy and have both. The short title is limited to 35 
spaces, and the full title will be complete in both cases. Whether or not 
call numbers should be included is another possible bone of contention. 
It depends on the use of the list, of course. 

Lists intended primarily for the use of patrons definitely should have 
call numbers so that a double table look-up is avoided. How extensive 
should cross-references be? Again, it depends on the use. (I would tend 
to be extra generous.) Finally, what will the holdings statements look 
like? How complete will they be? More blood has probably been let over 
this problem than any other. Let's assume that both volumes and years 
will be included. Should parts be listed if the volume is incomplete, or 
should the volume number be put in brackets or curves to show an in- 
complete volume? Erring on the side of completeness is best. If material is 
handled by electronic computer, the computer can take away what is not 
needed and store it for later use, but it cannot add what is not there. 
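
The point about erring toward completeness can be illustrated with a small sketch. The record structure and the printed forms below are assumptions, not any library's actual holdings format. The full detail is stored once, and either a short bracketed statement or a detailed one is derived from it.

```python
def holdings_statement(volumes, detailed=False):
    """Render a holdings statement from fully recorded volumes.

    Each volume is a dict with 'volume', 'year', 'parts' (number of parts
    published) and 'held' (set of part numbers actually on the shelf).
    """
    pieces = []
    for v in sorted(volumes, key=lambda v: v["volume"]):
        complete = len(v["held"]) == v["parts"]
        if complete:
            pieces.append("v.%d (%d)" % (v["volume"], v["year"]))
        elif detailed:
            parts = ",".join(str(p) for p in sorted(v["held"]))
            pieces.append("v.%d:no.%s (%d)" % (v["volume"], parts, v["year"]))
        else:
            # Short form: brackets mark an incomplete volume.
            pieces.append("[v.%d] (%d)" % (v["volume"], v["year"]))
    return "; ".join(pieces)


held = [
    {"volume": 1, "year": 1962, "parts": 4, "held": {1, 2, 3, 4}},
    {"volume": 2, "year": 1963, "parts": 4, "held": {1, 3}},
]
```

Here `holdings_statement(held)` gives `v.1 (1962); [v.2] (1963)`, while `holdings_statement(held, detailed=True)` gives `v.1 (1962); v.2:no.1,3 (1963)`. The short form can always be produced from the detailed record, but not the other way around.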

Finally, the physical format of the list can come in for limited dis- 
cussion. If the computer has a printer with only upper case and a limited 
number of special characters available, this is what will be used. If lower 
case is available, then the list will show greater variety and be easier to 
read. Some computers have a special feature called "Space Suppress." 
This, in effect, causes the computer to print certain designated lines two 
times, one on top of the other. Because of the vibration of the paper, the 
result looks like boldface. Argonne National Laboratory has a serials list 
which makes very effective use of the Space Suppress feature. Most im- 
portant, though, is capturing all that can be captured (lower case, if 
possible), using what can be used, and saving the rest for future use. 

The computerized serials list which is the most attractive of all is that 
of the Massachusetts Institute of Technology. It is produced using a 
large scale computer which produces paper tape which is in turn fed into 
a Photon photographic typesetter. Capitalization is handled through the 
computer program as the initial input is in only one case. 

The second type of computer application is that dealing with a single 
phase of serials handling. Checking-in, subscription ordering, bindery 
routines, routing, and so forth can be tackled as a discrete problem. 
Several libraries are approaching machine handling of serials in this 
manner. Naturally, librarians want to approach the problems first which 
irritate them most. And with certain libraries there are only one or two 
thorny problems. The rest of the serials routines would be so simple that 
it would be useless to waste computer time on their operation; the manual 
method is able to cope quite well. Unless a library is in this type of a 
position, I do not recommend that a spot approach be made. 

The third type of serials application and the one which will become 
most popular with libraries of any pretention to size is the total system 
approach. It would be assumed that other phases of library mechaniza- 
tion besides serials would eventually be added or would be in the plan- 
ning stage at the same time as serials. Except for financial routines, serials 
can be separated completely from other aspects of library automation. 
Briefly, the procedure to be followed in a total approach is, first, to make 
a complete study of the present system. With this background, the de- 
mands of a future system can be stated without equivocation. A list must 
be made showing every piece of information which has to be put into a 
system in order to achieve the desired results. In order to get this, the 
layout of the resulting documents must be known down to the last period. 
Now the computer programmer can be turned loose on the problem. It is 
up to him to take the data he was told he would have and turn out the 
documents which are needed. 

Once the computer program is written and tested, the full file of 
serials information must be converted into machine-readable form. Hav- 
ing gone this far, most librarians will probably go the rest of the way, but 
the conversion will be the most expensive and time-consuming part. It 
will also be the largest source of errors and will plague the system for 
months. I would not advise discarding the manual methods too soon; 
both the new and the old systems should be operated in parallel for some 
length of time, at least until the operators are convinced that the bugs 
are out of the new system altogether. 

Now the public can be weaned away from the old, barely adequate 
system to the new one with its infinite expansion and perpetually-avail- 
able information. 



REGIONAL GROUPS 
Only two recent meetings have been reported in time for inclusion in this 
issue. 

The Technical Services Section of the Wisconsin Library Association heard 
Pauline A. Seely (Denver Public Library) discuss "ALA Filing Rules— New and 
Revised." Elizabeth Rodell, RTSD Executive Secretary, then spoke on "The 
World of Technical Services." 

The New England Technical Services Librarians at their business meeting 
voted to become a section of the New England Library Association. The group 
retains its present name. (Doris Ransom, Chairman, Council of Regional Groups) 


Computerized Circulation Work: A Case 
Study of the 357 Data Collection System 



Ralph E. McCoy 
Director of Libraries 
Southern Illinois University 
Carbondale, Ill. 

MY ASSIGNMENT is to present a case study of the automated book 
circulation system now in operation at Southern Illinois Univer- 
sity, Carbondale. After almost three years of planning and experimenting, 
the system went into operation on a limited scale this spring. By means of 
a step-by-step conversion from manual to automatic operations, we have 
been able to test the equipment under actual operating conditions, to im- 
prove upon certain routines, and to adjust staff assignments with a mini- 
mum of difficulty. We expect the circulation process to be completely 
automated when school starts in the fall. 

Before describing the system, let me give you a picture of the circula- 
tion problems that led to our efforts at automation. 

Southern Illinois University has an enrollment of some 20,000 stu- 
dents, 14,000 of whom are on the Carbondale campus. With the excep- 
tion of four small specialized libraries, all books on the Carbondale 
campus are housed, and library services are provided, in a central build- 
ing—Morris Library. There are approximately 600,000 volumes in the 
central collection, arranged in four subject divisional libraries and classi- 
fied according to Dewey. Except for the Rare Book Collection, all books 
are on open shelves; and a central circulation point on the main floor 
serves the entire building. The concentration of book circulation at one 
point was one of the factors that made automation feasible. 

When we began the study of our circulation system in the fall of 1961, 
book circulation from the central library had reached a peak of a thou- 
sand volumes a day, and the annual rate of increase for some years had 
been in excess of the rate of increase in enrollment. The circulation sys- 
tem we had been using effectively for many years was a combination of 
McBee Key-Sort and Gaylord electric charging. The increased volume of 
books circulated in recent years had created a burden that the system was 
unable to support. For one thing, the congestion around the files, i.e. the 
filing of cards, the needling for overdue books, the answering of inquiries 
about books charged out, the library clearance of seniors, and the removal 
of cards in the discharging of books, forced us to move the files from back 
of the loan desk to another area. Separating the files from service contacts 
created problems of communication and required additional staff. When 
the situation became critical in the Fall of 1961, we turned for assistance 
to the University's Office of Systems and Procedures. 

I am certain that our immediate problems could have been met by a 
solution less drastic than complete automation, but the existence of a 
data processing unit on our campus interested in extending its services, 
and the fact that we were already using IBM facilities for book ordering 
and might make further use of automation in other library routines, 
prompted us to consider automation of book circulation. Furthermore, 
there was the prospect of correlating statistics on book use with student 
grades, test scores, and other data that would eventually be fed into a 
campus-wide data collection system. 

We were fortunate in having assigned to our library project a young 
graduate student in management, L. R. Dejarnett, who not only con- 
ducted the original study, which he incorporated in a master's thesis, but 
who stayed on the University staff to design the system and see it into 
operation. 

The study of the circulation system then existing consisted of gather- 
ing statistics and observing patterns of use, preparing flow charts, and, 
where feasible, conducting time and motion studies of specific operations. 
In all of these activities Mr. Dejarnett, under the direction of the Head 
of the Office of Systems and Procedures, R. D. Isbell, worked closely with 
members of the library staff and with the staff of the Data Processing and 
Computing Center which would eventually be involved in the operation 
of the system. 

In addition to the congestion around the files and the pile-up of books 
waiting to be discharged, the study revealed a serious problem in the 
handling of "snags," i.e., books which could not be discharged immedi- 
ately because no charge card could be found. We know snags can be the 
result of a number of possible errors — incorrect copying of the call num- 
ber on the McBee card (an error not detected by the circulation clerk at 
the time of charging), an error in filing the card (sometimes the result of 
difficult handwriting), or an error in withdrawing the card prematurely. 
When we consider that some forty student assistants, as well as six full- 
time clerks, serve at the loan desk during the one hundred hours a week 
the library is open, we know that the chance of error is magnified. 
Another trouble spot was the whole complex of pulling cards for overdue 
books, typing and mailing of notices, and recording payment of fines. One 
week's work on overdue books was hardly finished in time to start the 
process over again. 

In addition to studying the existing system, Mr. Isbell and Mr. De- 
jarnett visited libraries where automated circulation procedures were be- 
ing used and industries where automated inventory controls seemed to 
have some bearing on the library problem. They also examined numer- 
ous pieces of equipment which might be used in creating an ideal system 
and sought the advice of experts in the computer industry, particularly 
the IBM people, whose equipment we decided to use. The system which 
was ultimately devised, while it borrows ideas from several existing in- 
stallations, goes beyond those known to us in the extent of its automation 
and in the completeness of the controls which it provides. For example, 
one of our requirements which complicated the design was that the sys- 
tem should permit making requested books available to readers as 
promptly as possible through the device of "reserve" and "recall." The 
circulation study made by George Fry and Associates noted that this re- 
quirement distinguished the need of the university library from that of 
the public library. The use of a "transaction number," common to most 
automated circulation systems, would not provide positive identification 
of an item in circulation. 

Although the problems encountered in the design of the system were 
numerous and sometimes complex, the system that evolved was basically 
very simple. Three items of information are assembled in charging out a 
book — the borrower's identification number, the call number of the book, 
and the date due. Together, these items identify each transaction. They 
are fed into an IBM 357 data collecting unit and form the basis for the 
tape storage of circulation records. 
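
A minimal sketch of such a record follows. The class and method names are hypothetical, and an in-memory dictionary stands in for the tape file. Keying the file on call number, and not on an arbitrary transaction number, is what allows a requested book to be traced to its borrower for reserve or recall.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Charge:
    """One circulation transaction, identified by its three items."""
    borrower_id: str
    call_number: str
    date_due: date


class ChargeFile:
    """In-memory stand-in for the circulation file, keyed by call number."""

    def __init__(self):
        self._by_call_number = {}

    def charge(self, borrower_id, call_number, date_due):
        record = Charge(borrower_id, call_number, date_due)
        self._by_call_number[call_number] = record
        return record

    def discharge(self, call_number):
        """Remove and return the charge for a returned book, if any."""
        return self._by_call_number.pop(call_number, None)

    def who_has(self, call_number):
        """Look a book up by its own call number (None if it is not charged out)."""
        return self._by_call_number.get(call_number)
```

A reserve request then needs only the call number: `who_has` returns the borrower and the date due, or `None` when the book should be on the shelf.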

Let me describe the process of charging and discharging books. The 
borrower presents his book at the circulation desk, together with a plastic 
identification card. The identification card contains the borrower's pic- 
ture and his name, address, and identification number, embossed after the 
fashion of a gasoline credit card. His identification number has also been 
punched on the card in machine language. This ID card, in use on the 
campus for several years, serves many purposes besides library borrowing. 
It is issued to students by the Student Affairs Office at the time of regis- 
tration. Faculty cards, issued by the Personnel Office, use the social secur- 
ity number for identification. 

Each library book contains in its pocket a master IBM card bearing the 
call number of the book, expressed both in printed and machine lan- 
guage. The coding was devised so as to provide all of the necessary data 
on a single line of printing and still fit on a shortened IBM card, utilizing 
only 4% inches of the standard 7 3/8-inch IBM card. This enables the 
master book card to be folded when it is inserted into the pocket of a book 
of less than octavo size. The call number line is divided into six units for 
convenient reading — the Dewey number, the Cutter number, year, 
volume, part, and copy. Twenty-five units were provided for the call 
number, with an additional 16 units to accommodate the borrower's 
number and due date, a total of 41 units. I shall discuss later how these 
cards were prepared for a half-million volumes. 

The book card and the borrower's card are collected by the circula- 
tion clerk who depresses the due date (the third item of information in 
the transaction) on a manual entry keyboard (IBM 372) which is at- 
tached to the basic data collection unit (IBM 357). Due dates will vary, 
of course, depending upon the nature of the book and the status of the 
borrower. The book card and borrower's card are inserted into two slots 
in the 357 unit. This immediately activates a printing key punch (IBM 
026) which is located adjacent to the 357 input unit. The key punch produces 
two cards, each bearing the three items of information both in 
printed and machine language. One of these, a pink card, is returned 
with the master book card to the pocket of the book and represents the 
borrower's notice of date due. In addition, this card serves as a valid 
charge record at the exit check points. The other card (yellow) is stacked 
in the machine as the Library's record of the transaction. The identifica- 
tion card is returned to the borrower along with the book, and the clerk 
is ready to handle another transaction. The entire transaction takes about 
twenty seconds, only eight of which are machine time. 

The IBM 357 system consists of two 357 input stations, back to back, 
each equipped with a 372 manual entry keyboard for recording date due. 
These are joined by a single 358 control unit (the "little black box") and 
a single 026 key punch which serves two input stations. The control unit 
not only regulates the timing of the operation, but provides a read-back 
circuitry that performs an audit of the printing and punching. The sys- 
tem is mounted on a movable dolly. Three such systems with a total of 
six input stations constitute the final configuration of equipment for 
automated book charging in Morris Library. With the exception of a 
slight alteration of the throat of the 357's to accept embossed cards, no 
modification of standard hardware was required. 

When a book is returned, the clerk needs only to remove the date due 
card, verify that it is for that book, and send the book to the shelves. At 
the end of each day (11:00 p.m.) the accumulated transaction cards and 
return cards are taken to the Data Processing and Computing Center 
where they are loaded into an IBM 1401 computer unit to up-date the 
circulation file on magnetic tape. By 9:00 the following morning a con- 
solidated circulation list, printed in call number order, but giving bor- 
rower number and date due for each transaction, is delivered to the 
library. The computer program arranges the print-out by divisional 
library so that a copy of that portion of the book charges is supplied to the 
division for public use. A borrower who fails to find the book he wants on 
the shelf may check the charge record posted in the divisional library to 
learn when the book is due to be returned. 
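
The nightly run can be pictured as a batch update followed by a sorted listing. The sketch below rests on several assumptions: transactions are plain `(borrower, call number, date due)` tuples, call numbers begin with a three-digit Dewey class, and the mapping of Dewey hundreds to four divisions labelled A to D is invented for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical assignment of Dewey hundreds to four divisional libraries.
DIVISION_BY_HUNDREDS = {
    "0": "A", "1": "A", "2": "A",
    "3": "B", "9": "B",
    "4": "C", "7": "C", "8": "C",
    "5": "D", "6": "D",
}


def update_circulation(on_loan, charges, returns):
    """Apply one day's charge cards and return cards to the circulation file.

    on_loan maps call number -> (borrower, date due). A return card clears
    a charge only when it matches that charge exactly.
    """
    updated = dict(on_loan)
    for borrower, call_number, due in charges:
        updated[call_number] = (borrower, due)
    for borrower, call_number, due in returns:
        if updated.get(call_number) == (borrower, due):
            del updated[call_number]
    return updated


def daily_list(on_loan):
    """Group the charges by division, each group in call number order."""
    listing = defaultdict(list)
    for call_number in sorted(on_loan):
        borrower, due = on_loan[call_number]
        listing[DIVISION_BY_HUNDREDS[call_number[0]]].append(
            (call_number, borrower, due))
    return dict(listing)


yesterday = {"512.8 A31": ("123456", date(1964, 7, 1))}
charges = [("444555", "973.7 L63", date(1964, 7, 14)),
           ("222333", "301.2 B45", date(1964, 7, 14))]
returns = [("123456", "512.8 A31", date(1964, 7, 1))]
today = update_circulation(yesterday, charges, returns)
```

Matching the return card against the whole charge, not just the call number, keeps a late-arriving return card from cancelling a newer charge on the same book.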

Overdue book notices are prepared and addressed daily by the com- 
puter, using the circulation record tape, together with a tape file of the 
names and addresses of borrowers. This latter file, incidentally, is the 
basic "tie-in" to the total systems concept, and can relate library usage to 
such factors as grades, age, housing, etc. Fines for books which are over- 
due are assessed automatically according to a schedule provided in the 
programming and from two items of information — borrower status and 
date due — which were supplied in the original transaction. 
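
As a rough sketch of this step, the code below matches the circulation record against the borrower file and applies a fine schedule. The rates, the status names, and the record fields are all assumptions; the actual schedule used in the programming is not given here.

```python
from datetime import date

# Hypothetical schedule: cents per day overdue, by borrower status.
FINE_CENTS_PER_DAY = {"student": 5, "faculty": 0}


def fine_due(status, date_due, today):
    """Fine in cents, computed from borrower status and date due only."""
    days_late = (today - date_due).days
    if days_late <= 0:
        return 0
    return days_late * FINE_CENTS_PER_DAY[status]


def overdue_notices(on_loan, borrowers, today):
    """Match overdue charges against the name-and-address file.

    on_loan maps call number -> (borrower id, date due); borrowers maps
    borrower id -> dict with 'name', 'address' and 'status'.
    """
    notices = []
    for call_number, (borrower_id, date_due) in sorted(on_loan.items()):
        if date_due < today:
            person = borrowers[borrower_id]
            notices.append({
                "name": person["name"],
                "address": person["address"],
                "call_number": call_number,
                "date_due": date_due,
                "fine_cents": fine_due(person["status"], date_due, today),
            })
    return notices
```

Because the borrower file is shared, the same lookup that addresses a notice could also supply other borrower data for the use studies mentioned above.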

I have described the basic process. There are a number of refinements, 
exceptions, and limitations. We have had to make some compromises in 
the interest of mass processing, but we have not had to adapt ourselves, 
as Lewis Mumford has feared, "with pathetic docility to the limitation 
of the machine." For example, we have recognized the need to provide 
some means of charging a book manually when the borrower does not 
have his ID card. While we expect to be strict in requiring the use of the 
ID card once the system is in full operation, some exceptions will be 
made, particularly with faculty. We are determined that the use of the 
machine will be tempered with human understanding, despite protests 
from our systems people that even a dean should be turned away if he 
leaves his ID card in his other pants. The fact that the manual process is 
more troublesome for the borrower as well as for the library should en- 
courage the use of the ID card. 

We also have to make a manual transaction when the master book 
card is missing or, in the case of a book returned, the date due card is 
missing. (These IBM cards do make handy book marks.) Manual charges 
must also be made for certain categories of library material for which no 
master cards were made. This includes a collection of some 80,000 maps 
which are seldom circulated outside the building and which would have 
been difficult to process. We also excluded those government documents 
classified according to the Superintendent of Documents scheme; after 
considerable study, our analysts gave up as hopeless the task of fitting the 
notations of this classification scheme on a single IBM card. Bound 
volumes of journals are also not provided with master cards because they 
are not classified and because there is a limitation on out-of-building use. 

When there is no master book card, the book is charged out manually 
on a so-called Universal Charge form, using the embossed portion of the 
ID card in a manner similar to that employed at gasoline stations with 
the credit card and with much the same equipment. The call number or 
other identification of the volume is filled in by hand. The Universal 
Charge form provides three copies — one copy goes into the book pocket as 
a date due reminder, a second copy goes to the Data Processing and Com- 
puting Center for preparation of a master card and return card, and a 
third copy is retained by the Library as a temporary record, pending the 
receipt of the permanent card. The same set of forms is used for charging 
books without an ID card, the name and address of borrower being en- 
tered by hand. We have estimated that the number of manual transac- 
tions after the system is in full operation will be considerably less than 
one per cent of the total. Two-hour reserve books, because of the fre- 
quency of turnover, are excluded from the machine operation. 

Specially designed "courtesy cards" which can be accepted by the IBM 
357 unit will be issued on a term basis to guests, visiting scholars, short- 
term students, faculty wives, and to record removal of books for such 
internal processes as binding, recataloging, etc. 

The two most frequent questions asked by visitors about our auto- 
mated circulation system are: How long has it taken you to put the system 
into operation? What is the cost? 

As to time, the design of the system and the development of the equip- 
ment took about a year and a half; the coding of the master book cards 
and the insertion of the cards into a half-million books has taken another 
year and a half. There was some overlap in the two phases. 

The total cost of the project, excluding the salaries of the staff members 
who worked on the design, was approximately $40,000. This included 
the purchase of all equipment in the three systems. The operating cost 
of machine time at the Data Processing and Computing Center is not 
charged to the Library nor is the cost of student and faculty identification
cards. The Library pays for the IBM card stock, courtesy cards, and the 
expense entailed in mailing overdue notices. Because of the division of 
operating expense between the Library and the Computing Center, it 
would be difficult to establish unit cost figures. One full-time file clerk and
the services of a number of student assistants have been eliminated. On 
the other hand, additional clerical staff is required at the Computing 
Center. There will also be the expense of machine maintenance; this is a
cost factor often overlooked in automation. Accuracy, efficiency, and the
potential for handling increased loads, rather than a significant dollar 
savings, are the major advantages of the automated system. 

The most time-consuming and costly phase of converting to the new 
system was the coding and pocketing of the master book cards. This pro- 
cess was greatly simplified, however, when the Systems and Procedures 
Office, in cooperation with Science Research Associates, developed a pencil 
code sheet which could be optically scanned, thus obviating direct key 
punching. Working from the Library's shelf list, trained student assistants 
transferred the call numbers to code sheets, eight volumes to a single sheet, 
i.e., four on either side. The form allowed for coding multiple editions, 
volumes, or copies of the same book by using only one coding entry. The 
coded sheets were converted to a magnetic tape record by the Docu- 
Tran. The magnetic tapes, in turn, were processed on the IBM 1401 to 
create the master cards. A print-out of these punched cards was matched 
with the shelf list to discover any errors in coding. Master cards for cur- 
rent acquisitions are manually punched from one of the multiple order 
slips which is sent to the Computing Center as soon as a call number has
been assigned to the book by the cataloger. 

There may be interest in the procedure for making the identification 
cards. When a student is enrolled in the University, his picture is taken 
by a special camera furnished by the Photo Identification Company of 
Chicago. The film is sent to Chicago for development and the prepara- 
tion of a laminated card. The card is returned to our Computing Center 
where it is punched, then forwarded to the campus Photographic Service 
for embossing. Each card is proofread by running it through a 357 unit 
and comparing the print-out with the embossing. All this takes a week to
ten days. The card is good for a student's entire college career. It is up- 
dated to show payment of fees for the current quarter by means of a 
certificate of registration. The identification card and the current certifi- 
cate are issued in a two-window plastic envelope suitable for carrying 
in one's billfold. 

During the course of the recent trial period we have run into numer- 
ous problems, most of which have been of a minor nature and were 
readily solved. There remain only two unsolved problems of any conse-
quence; one deals with equipment, the other with procedure. We have
not yet found a satisfactory gadget for manual charging with the em- 
bossed plate, but we hope to have this settled by fall.* The other problem 
is devising a simple but effective method for flagging books for which 
there are personal reserves. Thus far, each proposal made by our systems 
analyst has seemed to us to be impractical. Ultimately, this matter is 
likely to be solved when the University has the equipment for random 
access storage. At such time the book circulation record could be trans- 
mitted directly from the input station to central storage equipment 
without benefit of punched card. Conversely, a book could be discharged 
simply by inserting the master card from the returned book into a dis- 
charge unit which would remove the charge from the central data 
storage and indicate when there was a reserve for the book. In fact the 
same impulse could generate the printing of a notification form to be 
mailed to the potential borrower. Furthermore, random access storage 
would permit a remote control station in the library to furnish immedi-
ate information to borrowers on the availability of a book, thus elimi-
nating the need for a print-out of the circulation list. These advances
could be carried out, I am informed, as modifications to our present 
system, once the central storage equipment is installed. 
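
The random-access arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory dictionary stands in for the central storage equipment, and the class name, method names, call number, and borrower numbers are all invented for the example rather than taken from the installed system.

```python
from dataclasses import dataclass, field


@dataclass
class CirculationFile:
    """Hypothetical stand-in for central random-access storage."""
    charges: dict = field(default_factory=dict)    # call number -> borrower id
    reserves: dict = field(default_factory=dict)   # call number -> waiting borrowers

    def charge(self, call_number, borrower_id):
        # Record the loan directly, with no intermediate punched card.
        self.charges[call_number] = borrower_id

    def reserve(self, call_number, borrower_id):
        # Queue a personal reserve against the title.
        self.reserves.setdefault(call_number, []).append(borrower_id)

    def is_available(self, call_number):
        # Immediate answer for a borrower, in place of a printed list.
        return call_number not in self.charges

    def discharge(self, call_number):
        # Remove the charge and return the next waiting borrower, if any,
        # so that a notification form can be prepared for mailing.
        self.charges.pop(call_number, None)
        waiting = self.reserves.get(call_number, [])
        return waiting.pop(0) if waiting else None


circulation = CirculationFile()
circulation.charge("QA251 .Z3", "S1001")       # invented call number and ids
circulation.reserve("QA251 .Z3", "F2002")
next_borrower = circulation.discharge("QA251 .Z3")
```

The point of the sketch is that the reserve check happens as a by-product of the discharge itself, which is why the flagging problem is expected to disappear once such storage is available.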

We expect to establish the automated circulation system on the Ed- 
wardsville campus when the new library building is completed a year 
from this Fall. 

I can perhaps sum up the advantages of our automated circulation 
system in this way: (1) it reduces the chance of error in making out 
charge cards, in filing them, and in discharging books; (2) it saves the
time of the borrower, who is relieved of filling out a card for every book;
(3) it reduces the bottleneck in discharging books and getting them re-
turned to the shelves; (4) it simplifies the routines of sending overdue
notices; (5) it simplifies the library clearance of seniors; (6) it lends itself
to greatly-increased circulation loads without strain; and (7) it facilitates
the compilation of useful statistical and analytical data. And I might add
one final advantage. It takes the pressure off the library to pursue im- 
practical schemes in the field of automation, projects that are unsuitable 
and premature although admittedly fashionable. 

In the process of mechanizing circulation routines, both librarians 
and systems people on our campus have benefited. We have learned to 
have respect for the ability of the machines to solve our problems, and
the computer people have, to their surprise, learned that what appeared
at first to be a fairly simple problem turned out to be highly complex.
As Burton W. Adkinson indicated at a recent conference at the Uni- 
versity of Illinois, the gap between the computer people and the librarian 
is closing as the two work together in solving library problems. In our 
case this gap was bridged effectively by the services of a systems analyst. 

* We have since adopted the electrically operated Dashew Datawriter 253. 

DATA PROCESSING SYSTEMS for purchasing, distribution, and
inventory control have been used successfully by commercial firms
for many years, but only a few librarians have as yet adapted the same 
techniques to solve their own acquisition problems. Many librarians, if not
most of them, seem to feel that the purchasing and processing of books, as
an over-all operation, is completely and uniquely different from the pur- 
chasing and processing of other items, such as machine parts or tomato 
sauce. To some degree, of course, they are different, and it would be both 
silly and disastrous to ignore those differences. The physical description
of a title being ordered for the library is usually longer and much more 
complicated than the description required to buy five cases of catsup, and 
the library order also differs in that most titles are purchased as single 
units that need not be re-ordered because they do not ordinarily go out of 
stock. But these are superficial differences and are a matter of degree 
rather than kind. Nearly all of the requirements of the business system 
(other than the requirement to show a profit) exist in the library as well: 
items are described and ordered, bills are paid and financial reports are 
provided to management, items are received, checked-in, arranged for 
use, and inventoried at periodic intervals. It seems reasonable to believe 
that carefully-designed data processing systems can be extremely useful 
in carrying out these similar operations in large libraries. 

One of the great advantages of a data processing system is that data 
prepared in one format for one operation need not be retyped to produce 
the document or format required for another operation. Corrections and 
additions can be made without disturbing other data already in the sys- 
tem, and new documents produced near the end of a cycle can be, to a 
large extent, by-products of earlier operations. As a result, many repeti- 
tive clerical routines can be eliminated, and the amount of revision re- 
quired to ensure accuracy can be greatly reduced. Members of the pro- 
fessional staff are free to spend more time on truly professional duties. 

Several portions of an integrated data processing system designed by 
the University of Illinois Library, Chicago, and the General Electric Com-
pany under a grant from the Council on Library Resources have now
been developed in detail and are in the process of being tested under a 
further grant from the National Science Foundation. 

Since the primary objective of the University of Illinois system is the
production of book catalogs and printed serials and circulation lists that 
can be used in various parts of the library or campus, it became apparent 
very early in the systems study that a computer, rather than unit-record
equipment, would be required to provide the necessary printing capa- 
bility. The acquisitions program described in this paper is designed for 
use with an IBM 1401 computer with 8000 positions of memory, four
tape drives, and all available advanced programming features. It is not 
mandatory, however, that the library itself have such a computer, or even 
that the university have one if arrangements can be made to purchase 
time from another company or from a data center. 

Before going into any discussion of coding, it might be well to indi- 
cate how the system assists the staff in carrying out the order function 
of the library. 

At the beginning of the order cycle, bibliographic data in the form of 
Library of Congress catalog copy plus necessary control codes and order 
information are introduced to the system on punched cards and converted 
to magnetic tape. Whenever possible, these bibliographic data consist of 
the complete LC entry, including the LC call number. A series of com- 
puter programs then manipulates the tape record, extracting pertinent 
sections and arranging them in appropriate formats to produce purchase
orders, in-process lists, financial records, book cards, and catalogs. The tape 
record is added to and modified as necessary, primarily by means of pre- 
punched cards produced automatically as part of the order operation. 

During this past year, the Library decided to improve the efficiency of 
its technical operation by eliminating all local exceptions to Library of 
Congress cataloging and by changing from the Dewey to the LC classifica- 
tion. This decision, together with the requirements of the machine opera- 
tion, will shift considerably the workload within the technical services area, 
and will move many of the duties traditionally associated with cataloging 
into the area of order verification. The duties themselves do not become 
less important, but will take place at a different time and will serve a 
somewhat different purpose than before. 

As was the case with the manual system, the order process begins with 
the delivery of approved order requests to the order section for searching 
and verification. For the time being, at least, both searching and verifica- 
tion will remain manual operations. An attempt is made to provide a 
complete LC entry, either by adding to the information on the order re- 
quest or by providing a proof slip. The bibliographic and order ele- 
ments are coded for machine identification, and the order is passed on to 
the Data Processing Division for key-punching and processing. 

The order prepared for the vendor consists of computer-printed 
3 x 5 slips (see Figure 1), which are returned to the order section with
a deck of cards to be used for check-in when the material arrives, and a 
deck of financial cards that have already been used to encumber funds 
and will be used again to produce vouchers when payment is finally au- 
thorized. The order slips are sorted by vendor number, attached to cover- 
ing order letters, and mailed. The card decks are placed in tub files and 
held for later use. 
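
The sorting step can be illustrated with a short Python sketch. It assumes each order slip is represented as a small record carrying its sequence number and vendor number; the record layout and the function name are invented for the example, and the sample values are read from the figures in this paper.

```python
from itertools import groupby
from operator import itemgetter


def batch_by_vendor(slips):
    """Sort order slips by vendor number and group them, one batch per covering letter."""
    ordered = sorted(slips, key=itemgetter("vendor"))   # stable sort keeps slip order
    return {
        vendor: [slip["sequence"] for slip in group]
        for vendor, group in groupby(ordered, key=itemgetter("vendor"))
    }


slips = [
    {"sequence": "WYTRJAAP1403", "vendor": "105075"},
    {"sequence": "WHEEBJBL1443", "vendor": "140300"},
    {"sequence": "ZARIO CA1443", "vendor": "105075"},
]
batches = batch_by_vendor(slips)
```

Each key of `batches` corresponds to one covering letter, with the slips for that vendor attached in their original order.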

    UNIV OF ILL CCC   U-6170   WYTRJAAP1403   105075

    WYTRWAL, JOSEPH ANTHONY, 1924-
    AMERICAS POLISH HERITAGE, A SOCIAL HISTORY OF THE POLES IN AMERICA.
    DETROIT, MICHIGAN,
    ENDURANCE PRESS,

    ALOC 3930   01 COPIES   $6.50

Top line, left to right: name of library, library account number,
sequence number for title, vendor number. Bottom line, left to right:
departmental allocation, number of copies, estimated list price.

EXAMPLE OF COMPUTER PRODUCED ORDER SLIP
(Figure 1)

In addition to the order slips, check-in cards, and financial cards, Data 
Processing produces a new Process Information List, or PIL, every time 
orders are prepared. This list has two major functions: it serves as an on-
order and in-process file, and as a supplement to the catalog. Present
plans are to produce new orders and a revised copy of the PIL once each 
week, although rush orders are produced manually and added to the 
system as part of the next order cycle (see Figure 2). 

Incoming materials and notices pertaining to orders are checked 
against the PIL. If the item has arrived in acceptable condition, the 
receipt card is pulled and placed in a new receipts file; if it arrives 
damaged, the receipt card is pulled and coded to indicate that the mate-
rial arrived in unacceptable condition. If the dealer cancels the order, or
indicates that shipment will be delayed, the receipt card is pulled and 
coded to indicate these facts. Then, at the established time, the receipt 
cards removed from the on-order file during the week are gang-punched 
with the appropriate new status codes plus the current date, and are re- 
introduced to the system as part of the new order cycle. The new PIL 
then reflects not only new orders, but changes in the status of earlier 
orders as well. In addition to the corrected statement in the PIL, three 
new documents may be produced at this time: a check-in card for the 
cataloger, a book card, and a pocket label. 
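
The weekly update can be pictured as follows. This Python sketch assumes the PIL is held as a table keyed by sequence number and that each returned receipt card carries a status code; only "ORD" appears in the sample list, so the other code ("REC") and all function and field names are invented for illustration.

```python
from datetime import date


def apply_receipt_cards(pil, receipt_cards, run_date):
    """Return a new PIL with each returned receipt card's status and date applied."""
    updated = {seq: dict(entry) for seq, entry in pil.items()}   # leave the old list intact
    for card in receipt_cards:
        entry = updated[card["sequence"]]
        entry["status"] = card["status"]        # new status code from the card
        entry["status_date"] = run_date         # date gang-punched for the whole batch
    return updated


old_pil = {
    "WYTRJAAP1403": {"status": "ORD"},
    "WINSF SC1443": {"status": "ORD"},
}
cards = [{"sequence": "WYTRJAAP1403", "status": "REC"}]   # "REC" is a made-up code
new_pil = apply_receipt_cards(old_pil, cards, date(1963, 5, 24))
```

Entries with no receipt card simply carry forward unchanged, which matches the description of a new list that reflects both new orders and changes to earlier ones.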

PROCESSING INFORMATION LIST          24MAY63          19

WHEEBJBL1443 01 000420 140300 8300 1 ORD*
    WHEELER-BENNETT, JOHN WHEELER, 1902-
    BREST-LITOVSK, THE FORGOTTEN PEACE, MARCH 1918.
    LONDON, MACMILLAN, 1938.

[control line illegible]
    CATALOG 117, NUMBER 87.
    WINGFIELD-STRATFORD, ESME CECIL, 1882-
    KING CHARLES AND KING PYM, 1637-1643. LONDON,
    HOLLIS AND CARTER, 1949.
    [line illegible]

WINSF SC1443 01 000350 105075 8400 1 ORD*
    WINSOR, FREDERICK.
    SPACE CHILDS MOTHER GOOSE. NEW YORK, SIMON AND
    SCHUSTER, 1958.

WRIGF VS1443 01 000750 105075 8300 1 ORD*
    WRIGHT, FRANCES.
    VIEWS OF SOCIETY AND MANNERS IN AMERICA.
    CAMBRIDGE, MASSACHUSETTS, BELKNAP PRESS,
    6/14/1963.
    JOHN HARVARD LIBRARY

WYTRJAAP1403 01 000650 105075 3930 1 ORD*
    WYTRWAL, JOSEPH ANTHONY, 1924-
    AMERICAS POLISH HERITAGE, A SOCIAL HISTORY OF THE
    POLES IN AMERICA. DETROIT, MICHIGAN, ENDURANCE
    PRESS, 1961.

ZARIO CA1443 01 001690 105075 3500 1 ORD*
    [line illegible]
    ZARISKI, OSCAR.
    COMMUTATIVE ALGEBRA. BY OSCAR ZARISKI AND PIERRE
    SAMUEL. PRINCETON, NEW JERSEY, VAN NOSTRAND, 1958,
    1960.

SAMPLE PAGE OF PROCESSING INFORMATION LIST
(Figure 2)



The financial operation is very much like the check-in operation. The 
Financial Clerk matches incoming invoices against the PIL to see whether 
or not the item being billed has arrived and has been accepted. If so, the 
financial card is pulled from the tub file, checked for accuracy, has 
changes indicated, and is then held with the invoice. When all items on 
the invoice have been accounted for or a partial payment is authorized, 
the cards are sent to Data Processing to have the invoice number gang- 
punched into the cards plus the manual punching of changes or correc- 
tions. After the production of vouchers, the cards are merged with the 
financial cards for new orders and the payment is reflected in the new 
financial summary for the week. 

At this point the acquisitions operation per se has been completed, 
and the material is now in the hands of the cataloger. If the complete LC 
entry and call number were available at the beginning of the order cycle,
formal cataloging is limited to a simple matching operation to be sure 
that the catalog entry fits the edition actually supplied. If part of the 
catalog data is lacking, or if the copy originally supplied requires mod- 
ification or additions, the cataloger fills out a code sheet to be sent to the 
Data Processing Division for key-punching and processing together 
with the cataloger's receipt card, which has now been punched to show 
that cataloging has been completed. The next issue of the PIL will show 
that cataloging is complete, and that the entry can be transferred from 
the PIL to the catalog when the next catalog supplement is printed. 

This completes the description of the order program cycle. For the 
remainder of the time, I would like to discuss the lay-out of the cards 
used to introduce data into the system and some of the decisions that had 
to be made before these lay-outs could be finally established. 

The individual punched cards are laid out in such a manner that the 
first 18 columns are used for control information, and the remaining 62 
columns for bibliographic or order data. The first 12 of the control group 
are used for a sequence number, or Luhn number, which is assigned to 
the title at the time of coding and serves as an identification number for 
the title until acquisitions and cataloging have been completed. Since this 
number is based on the author's name and the first significant words of 
the title, it is also useful as a means of arranging entries in the PIL with- 
out going into filing routines (Figure 3). 
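
The paper does not give the rule for forming the sequence number, but the sample entries in Figures 1 and 2 suggest a pattern for its first eight characters. The Python sketch below encodes that guess and nothing more: four letters from the first word of the author's name, the initials of the next two name words (blank if absent), and the initials of the first two significant title words. The stop list and the function name are assumptions, and the four trailing digits seen in the figures are not explained in the text and are left out.

```python
import re

# Title words treated as not significant (an assumed stop list).
INSIGNIFICANT = {"A", "AN", "THE", "OF", "AND", "IN", "ON"}


def sequence_prefix(author, title):
    """Build the 8-character alphabetic part of a sequence number (inferred pattern)."""
    name_words = re.findall(r"[A-Z]+", author.upper())
    title_words = [
        word for word in re.findall(r"[A-Z]+", title.upper())
        if word not in INSIGNIFICANT
    ]
    surname = name_words[0][:4].ljust(4)                          # 4 letters of first name word
    initials = "".join(w[0] for w in name_words[1:3]).ljust(2)    # next two name words
    title_part = "".join(w[0] for w in title_words[:2]).ljust(2)  # first two significant words
    return surname + initials + title_part


prefix = sequence_prefix("WYTRWAL, JOSEPH ANTHONY, 1924-", "AMERICAS POLISH HERITAGE")
```

Because the key begins with the author's name, sorting the keys as plain character strings yields approximately author order, which is what allows the list to be arranged without separate filing routines.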

The next three columns of the control group are used for biblio- 
graphic codes. These are required because, as part of its decision to accept 
Library of Congress copy without exceptions, the Library refused to limit 
the length of any part of any catalog entry by setting fixed fields for 
specific kinds of bibliographic data, even though such a decision would 
have simplified some of the problems of the Data Processing Division. 

The problem of using fixed length or variable length fields for catalog 
data is roughly analogous to a situation requiring the storage of successive 
shipments of different machine parts packaged in identical boxes. If the 
storekeeper does not know in advance exactly how many boxes of each 
part will be received in future shipments, he has two choices for the
arrangement of his shelves: (1) he can set aside a group of shelves for
each part, allowing for the accommodation of a maximum number of
each kind, risking wasting shelves or the more serious problem of not
having enough space to store some shipments; or (2) he can take as much
space as he needs for each kind of part, letting the amount vary from
shipment to shipment, and identifying the parts by placing labels on the
shelves.

[Punched card image. Labeled fields: Sequence Number, Bibliographic Code, Card Count, Bibliographic Data.]

LAYOUT OF CARDS USED FOR ACQUISITIONS INPUT
(Figure 3)

The University of Illinois chose to work with variable length fields, 
and must therefore label the bibliographic shelves to identify the parts of 
the catalog entry. There are, at the present time, 25 different labels in use. 
Three of these pertain to order information, and the other 22 identify 
parts of the bibliographic data. Some of these labels will be rarely used, 
while others will be used every time; it is entirely possible that some of 
the present bibliographic groups, such as the collation and the notes, 
may be broken into smaller parts with new labels of their own. 
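
The storekeeper's two choices can be put side by side in a short Python sketch. The field names and the fixed widths are invented for the comparison; they are not the Library's actual labels or any proposed field sizes.

```python
# Fixed-length layout: every field gets a preset width (widths are invented).
FIXED_WIDTHS = {"author": 20, "title": 30, "imprint": 20}


def pack_fixed(entry):
    """Pad or truncate each field to its preset width."""
    return "".join(
        entry.get(name, "")[:width].ljust(width)
        for name, width in FIXED_WIDTHS.items()
    )


def pack_labeled(entry):
    """Keep each field at its natural length, identified by a label."""
    return [(name, text) for name, text in entry.items() if text]


entry = {
    "author": "WYTRWAL, JOSEPH ANTHONY, 1924-",
    "title": "AMERICAS POLISH HERITAGE, A SOCIAL HISTORY OF THE POLES IN AMERICA.",
    "imprint": "DETROIT, MICHIGAN, ENDURANCE PRESS,",
}
fixed = pack_fixed(entry)       # always 70 characters; long fields are cut short
labeled = pack_labeled(entry)   # nothing is lost, at the cost of carrying labels
```

With these widths the fixed record loses most of the subtitle, while the labeled form keeps the entry intact; that loss is the price the Library declined to pay.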

The portion of the card deck containing any particular type of bib- 
liographic data can be of any length, and the cards in each code group 
are kept in proper order by a two digit card count number punched into 
columns 16 and 17 of the cards. Only one kind of information is coded 
into a particular card or group; the next kind of data always begins on 
a new card with its own code designation and card count sequence. The 
end of a bibliographical group is indicated to the computer by a word
mark punched at the end of the data.
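
The card layout lends itself to a small worked example. The Python sketch below follows the column assignments given in the text (sequence number in columns 1-12, bibliographic code in 13-15, card count in 16-17, data in 19-80); the text does not say what column 18 holds, so it is ignored here. The three-letter codes "AUT" and "TIT", the helper names, and the choice to strip trailing blanks instead of modelling the end-of-group mark are all assumptions made for illustration.

```python
from collections import defaultdict


def make_card(sequence, code, count, data):
    """Build an 80-column card image (hypothetical helper for the example)."""
    return f"{sequence:<12}{code:<3}{count:02d} {data}".ljust(80)


def parse_card(card):
    """Split one 80-column card image into its control and data parts."""
    card = card.ljust(80)
    return {
        "sequence": card[0:12],       # columns 1-12
        "code": card[12:15],          # columns 13-15
        "count": int(card[15:17]),    # columns 16-17
        "data": card[18:80],          # columns 19-80
    }


def assemble(cards):
    """Rebuild each variable-length field from its cards, in card-count order."""
    groups = defaultdict(list)
    for card in map(parse_card, cards):
        groups[(card["sequence"], card["code"])].append(card)
    # Data is assumed to run on from card to card with no break at column 80.
    return {
        key: "".join(c["data"] for c in sorted(group, key=lambda c: c["count"])).rstrip()
        for key, group in groups.items()
    }


title = "AMERICAS POLISH HERITAGE, A SOCIAL HISTORY OF THE POLES IN AMERICA."
deck = [
    make_card("WYTRJAAP1403", "TIT", 2, title[62:]),    # deliberately out of order
    make_card("WYTRJAAP1403", "AUT", 1, "WYTRWAL, JOSEPH ANTHONY, 1924-"),
    make_card("WYTRJAAP1403", "TIT", 1, title[:62]),
]
fields = assemble(deck)
```

The card count is what lets a shuffled deck be restored to order before the data is strung back together.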

The system just described has worked satisfactorily in a test situation 
where input data and the timing of update operations could be pre-de-
termined and controlled. It accomplishes the acquisition and cataloging
of material as parts of one continuous, integrated process rather than as 
two separate functions. The parallel operation to be started in the fall 
will determine whether or not the system is equally satisfactory in a rou- 
tine, work-a-day situation, and will attempt to establish cost figures com- 
paring the relative economies of the machine system and the manual 
system it attempts to replace. If the new system proves to be economically 
sound as well as mechanically feasible, we will hope to establish it as a 
permanent part of our library operation. 

Dissemination of Information



I. A. Warheit 
General Products Division 
IBM Corporation, San Jose, California 

IN DISCUSSING the dissemination of information, we can, for con- 
venience sake, consider the problem from two points of view: 

(1) The degree or span of responsibility that the librarian has in 
providing special services to his clientele and, 

(2) The degree or span of bibliographic control that the librarian 
can afford to exercise on all of the materials received by the library. 

The different libraries represent a very wide spectrum of responsibili- 
ties to the reader. Some are merely convenient storehouses with the most 
rudimentary finding tools for their collections, and the reader does all of 
his own searching. At the other extreme are libraries that try to provide 
every possible service in supplying information to their clientele. The 
services not only cover reading material, but sound recordings, films, pic- 
tures, and even hand tools of all kinds. 

The direct personal service to users has in the past been confined, pri- 
marily, to the special libraries which are responsible only to a very re- 
stricted clientele and usually in a narrow subject area. The libraries 
which have much broader responsibilities, both in clientele and subject 
matter, have not been able to provide many special and individual serv- 
ices nor have they been able, bibliographically, to control much beyond 
the hard bound book. In most instances, they could not afford to process 
fully such publications as dissertations, reports, pamphlets, map collec- 
tions, moving pictures, slides, art collections, and the like. 

With the application of the computer to bibliographic processing, it 
suddenly became possible to provide specialized, direct services to library 
users and to do library processing very rapidly and on a mass scale. 
Mechanized processing, however, did not find too much favor with pro- 
fessional librarians. It was only a few rebels and people outside the pro- 
fession who adopted the new techniques. Also these techniques were 
confined to special situations and to literary forms that were not normally 
processed by libraries, such as near-print materials, foreign and domestic 
journal articles, collections of photographs, statutes, convention papers, 
the preparation of book indexes, and so on. 

There now began a series of tests, a few superficial studies, and a very 
loud debate as to the effectiveness of mechanized subject analysis. As is 
usual in such emotional situations, much more heat than light was shed 
on the situations. Now, however, as the initial temper tantrums subside, 
as experience is built up, and the more objective viewpoints come to the 
fore, there is a growing realization that these mechanized techniques, 
when applied with discrimination, can be extremely effective and can
extend the ability of the librarian to provide services which had been 
beyond his economic capabilities. 

Although I am sure most of you are well acquainted with the com- 
puter-prepared dissemination tools, I would like briefly to describe them 
and the techniques for preparing them. Essentially, what the computer 
can do is recognize words, sort them, and print them. It does not know 
the meanings of these words — they are just symbols — but it can distin- 
guish their physical differences. A computer, therefore, can prepare a 
concordance, can alphabetize, and otherwise arrange lists of words. And 
by a table-lookup technique, that is by checking a dictionary, an authority 
list, a thesaurus, a cross-reference list, or whatever you want to call it, the 
computer can bring synonyms together, provide cross-references, and also 
eliminate terms that are non-informative, redundant, or otherwise un- 
wanted. With these capabilities, the computer can do what is usually re- 
ferred to as "word" indexing. It cannot do "concept" indexing. It can do 
word indexing provided, of course, enough of the words which adequately 
describe the contents of the item to be processed are available in machine-
readable form. We therefore have two problems. Are there enough words
present to describe the contents of the item? And are the ideas or concepts in
the document to be indexed obvious from the implications of the text
even though not expressed in the available words? For example, an
Atomic Energy Commission document is all about the chlorination of 
uranium. It is obvious from the text that this is one of several processes 
being investigated for the separation of uranium from its ore. However, 
the machine-readable portion of the document (the title, abstract, con-
clusions, etc.) does not mention this. How is it possible, therefore, for the
computer to provide the necessary subject tracing for Uranium-Ore bene- 
ficiation or Uranium-Separation processes? Another example: an article 
describes how a certain drug caused a serious blood deficiency. Nowhere 
in the text are the terms "toxicity" or "side effects" used. Yet to retrieve 
this document, somewhere in the subject heading must appear "toxicity", 
"side effects", or something similar. 

There are other text-processing problems, such as homonyms (lead 
and lead), order of terms (there is a difference between "A House of 
Cards" and "A Card House"), poetic titles or non-informative titles ("First 
Progress Report of the Radiation Laboratory"), and many more. These 
problems are usually solved easily by the human but are difficult, i.e. re- 
quire very elaborate machine programs, or are impossible as yet for the 
computer. 
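
The capabilities and the limits of word indexing described above can both be seen in a few lines of Python. This is a sketch under stated assumptions, not any production program: the stop list and the authority list are invented, and the two sample texts merely paraphrase the examples given earlier.

```python
from collections import defaultdict

# Non-informative words to be eliminated (an assumed stop list).
STOP_WORDS = {"A", "AN", "THE", "OF", "IN", "ON", "AND", "FOR", "BY", "TO"}
# Authority list bringing variant forms together (invented entries).
AUTHORITY = {"DRUGS": "DRUG", "ORES": "ORE"}


def word_index(documents):
    """Map each retained word to the sorted list of document numbers containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for raw in text.upper().split():
            word = raw.strip(".,;:()\"'")
            if not word or word in STOP_WORDS:
                continue
            index[AUTHORITY.get(word, word)].add(doc_id)   # table lookup
    return {term: sorted(ids) for term, ids in sorted(index.items())}


documents = {
    1: "The chlorination of uranium",
    2: "A certain drug caused a serious blood deficiency",
}
index = word_index(documents)
has_toxicity = "TOXICITY" in index
```

The index finds "URANIUM" and "DRUG" because those words are physically present; `has_toxicity` is false because no lookup table can supply a concept the text never names.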

In other words, straight word indexing cannot in every instance fully 
and properly index every document. Actually, various surveys, notably of 
legal literature and of scientific and technical articles, have shown that 
word indexing is very good. And although the computer does, on oc- 
casion, fail to index all relevant concepts, it often picks up useful terms 
that the human indexer missed, and it certainly is less prone to errors and
is much more consistent than humans. In fact, in certain kinds of in-
dexing where the rules can be precisely stated, such as the development
of linear notation for organic chemical structures, the computer does a 
much better job than the best human indexer. 

Understanding these capabilities and limitations of the computer, 
some librarians are beginning to evolve new ideas on how to apply the 
computer to many of the obvious and routine processing chores, leaving 
the cataloger or indexer to revise this mechanical indexing and add those 
special elements which are required. This is sometimes referred to as 
cataloging by exception. Details of such techniques are the subject of 
other papers at this conference; I mention this very briefly only in order 
that we might have the right perspective on computerized dissemination 
techniques. 

Since the computer can recognize words and can sort and print at very 
great speeds, H. P. Luhn reasoned that the words in the title of an article 
could be permuted and aligned in alphabetic sequence on a particular 
column or set out in the margin and printed out. In reality this is a form 
of catch title indexing and is similar to the "Schlagwort" catalogs that are 
still to be found in some of the older German university libraries. 

Mr. Luhn was very modest in his proposal. He did not present his idea 
as a substitute for cataloging, but essentially as a technique for very 
quickly preparing scannable and searchable announcement bulletins. 
Mr. Luhn called the process Keyword-in-Context or KWIC indexing. 
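
The permutation idea is simple enough to sketch in Python. What follows is a minimal illustration of the technique, not Mr. Luhn's program or any of the distributed KWIC programs; the stop list, the column width, and the reference codes are assumptions.

```python
# Words that are not allowed to serve as keywords (an assumed stop list).
STOP_WORDS = {"A", "AN", "THE", "OF", "IN", "ON", "AND", "FOR", "BY", "TO"}


def kwic(titles, width=30):
    """Return KWIC lines: each keyword aligned on one column, sorted alphabetically."""
    entries = []
    for ref, title in titles.items():
        words = title.upper().split()
        for i, word in enumerate(words):
            if word in STOP_WORDS:
                continue
            left = " ".join(words[:i])[-width:]     # context before the keyword
            right = " ".join(words[i:])[:width]     # keyword and context after it
            entries.append((word, f"{left:>{width}}  {right:<{width}}  {ref}"))
    return [line for _, line in sorted(entries)]


titles = {
    "R1": "A House of Cards",
    "R2": "The Chlorination of Uranium",
}
for line in kwic(titles, width=20):
    print(line)
```

Every significant word of every title becomes an entry point, and the surrounding words stay in their original order, so "A House of Cards" is filed under both HOUSE and CARDS without being confused with "A Card House".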

The KWIC programs, in addition to providing a subject approach 
through permuting the words of the title, also prepare author indexes,
both personal and corporate, and, in some applications, other source in-
formation. Various types of KWIC indexes are shown in Figures 1-7.

After being tested experimentally, the KWIC program was first applied 
on a large scale by Chemical Abstracts to produce Chemical Titles. This 
publication was prepared so that the chemists and other researchers could 
get information about chemical publications, weeks and even months 
before they appeared in Chemical Abstracts. In 1963 Freeman and Dyson 
reported that the average monthly issue of Chemical Titles indexed some 
2,800 articles and 5,900 authors and covered about 600 journals. The 
entire index was prepared by two keypunch operators, an editor, and a 
clerical assistant with the computer producing an average issue of 125 
pages in about 4½ hours.

Since that time KWIC techniques have been applied so widely and so 
many new applications are being developed constantly, it is impossible to 
list them all. A few typical examples of KWIC indexes are: library acces- 
sion lists, section indexes for procedure manuals, indexes for computer 
programs, special bibliographies, correspondence file indexes, indexes 
for papers of professional meetings, cumulative indexes for periodicals, 
indexes of statutes, indexes of technical photographs, lists of standard 
parts and manufactured products, tool lists, serials lists, and so on. 

Details about KWIC indexing are given in IBM General Information 
Manual No. E20-8091, Keyword-in-Context (KWIC) Indexing, and in 
many technical articles of which a select few are listed in the bibliography 
accompanying this paper. Machine programs are available. The 7090
program developed for the Bell Laboratories Library can be obtained 
through SHARE; two 1401, a 704, and a 1620 program are all available 
from the IBM Program Information Department in White Plains, N.Y. 
Many individuals, companies, and schools have developed their own 
special versions of KWIC, and some of these programs will also be sup- 
plied to requesters. 

The cost of producing a KWIC listing will vary greatly, depending on 
local conditions, availability of equipment, and salaries. In one instance 
which may be considered typical, costs were broken down as follows: a 
monthly index covering 100 journals for a total of 2,000 articles, averaging
4½ punch cards per article, generated 9,000 cards. These were processed
on an IBM 1401 in a little less than 3 hours, producing a Keyword-in-Context
index, an author index, and a bibliographic list. Machine
costs were $55 for card punch and verifier and $225 for the computer.
Keypunch operator salary was $300, for a total of $580, or an average cost of
29 cents an article. This does not include, of course, the cost of printing 
multiple copies. 
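
The arithmetic behind these figures can be checked with a few lines of
Python. The figures are those quoted above; the function name
kwic_cost_breakdown and its parameter names are invented for this sketch.

```python
def kwic_cost_breakdown(articles=2000, cards_per_article=4.5,
                        punch_and_verifier=55, computer=225,
                        keypunch_salary=300):
    """Recompute the monthly KWIC cost example from the quoted figures."""
    total_cards = articles * cards_per_article
    total_cost = punch_and_verifier + computer + keypunch_salary
    return {
        "cards": total_cards,                       # punch cards generated
        "total_cost": total_cost,                   # dollars per monthly index
        "cost_per_article": total_cost / articles,  # dollars per article
    }


if __name__ == "__main__":
    for name, value in kwic_cost_breakdown().items():
        print(name, value)
```

Changing any one input, for example the number of cards per article,
shows how sensitive the cost per article is to local conditions.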

It is this incredibly low cost per item which really makes it possible 
for librarians to consider applying this processing technique to their 
holdings which they now cannot afford to catalog. The near-print mate- 
rial, the dissertations, art collections, maps, the various vertical file hold- 
ings, and a host of other special materials are all potential candidates for 
this type of inexpensive bibliographic processing. The only professional 
effort required is to provide each item with a good title. As a rule, most 
libraries do this already. Another potential benefit is that such indexes 
can be produced in multiple copies and thus made available to depart- 
mental libraries, branches, and to individuals. 

In addition to producing library bulletins, accession lists, and indexes, 
computer techniques are also being used to provide direct personal ser- 
vice to individuals. With the vast increase in publication and the pro- 
liferation of journals, reports, and other separates, it is quite impossible 
for any individual to keep current in his various fields of interest. Nor is 
it practical for a librarian to screen all the inputs to his library and direct 
the pertinent material to the various individuals according to their sub- 
ject interests. The best he can do is to set out all new acquisitions for a 
few days so that the library clientele can scan the material or to reproduce 
the title pages and tables of contents and distribute them. 

The library user in browsing through the new acquisitions or check- 
ing the tables of contents is looking for clues to see if he should read the 
article or book. These clues are usually the words used in the title, the 
abstract, the opening paragraph, or the conclusions. There are, therefore, 
certain words which stimulate his interest. If he could list these words, it 
would be a simple matter for a computer to match them with the words 
which appear in titles, abstracts, opening paragraphs, and conclusions. 
If there is a sufficient match, then the article or book will probably be of 
interest to him. In a sense, the computer can browse through the new
literature and pick out the items of potential interest to an individual.

This technique has been in use now for over four years to provide a current
awareness service to many individuals. It is generally referred to as Selective
Dissemination of Information, or SDI, and there are a number of variations
of the system. As an example, let us look at one which is programmed for
the IBM 1401 and is available as program 1401-CR-DIX. A general
description of SDI is given in IBM General Information Manual
E20-8092, Selective Dissemination of Information, and in several references
in the bibliography.

The two elements which are matched in the computer are the biblio- 
graphic record of the citation and the interest register or "profile" of the 
individual. The user of the system makes up a list of words which reflect 
his interests; this is his "profile." He may choose the words from a special 
list, or he may write a paragraph or two describing his interests, these to 
be indexed by an indexer, or he may just list words that occur to him. 
Sometimes he is guided by examining interest profiles of others engaged 
in the same type of work. Figure 8 is an actual interest profile of an indi- 
vidual working in the field of information retrieval. 

There are two types of words on the list: exact terms and word roots. 
The root Librar will match on LIBRARY, LIBRARIES, LIBRARIANSHIP,
LIBRARIAN, etc., whereas the keyword Library will match only
on LIBRARY. The same root and full word are both included
because library, the full word, is weighted as 3, but the root is weighted
only as 1. The "hit level" of this profile is 03. This is a measure of the 
similarity that must exist between the profile and the document citation 
before the latter is considered a "hit." The mere occurrence of LIBRARY 
in a title or abstract will retrieve that document. However, if only LI- 
BRARIAN appears in the citation, then other profile keywords must 
show up so that the weight sum of matched keywords and roots equals or 
exceeds 3. 

Negative weights may also be assigned to keywords and roots. When 
these occur, then the total weight of the matched terms is reduced. 
Examples of negative weights are shown on the profile assigned to three 
journals. The requester in this instance reads these journals regularly, 
and he does not want to receive any notices of articles which have appeared
in American Documentation, Library Journal, or Special Libraries.
The hyphen in Information-Retrieval is present to tell the computer 
that this word pair must exist, otherwise the match will be terminated at 
the first space which is immediately after Information. 
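
The matching rules just described (exact terms, word roots, weights,
a hit level, negative journal weights, and hyphenated word pairs) can be
sketched in Python as follows. This is an illustration only and not the
1401-CR-DIX program. The weights 3 and 1 and the hit level of 3 come from
the profile discussed above; the weight of 2 for Information-Retrieval,
the journal weight of -99, and all class and function names are
assumptions. The sketch also assumes that every profile entry that
matches contributes its weight once, so that LIBRARY in a citation
scores 3 + 1.

```python
import re
from dataclasses import dataclass


@dataclass
class ProfileTerm:
    text: str              # e.g. "LIBRARY", "LIBRAR", "INFORMATION-RETRIEVAL"
    weight: int
    is_root: bool = False  # True: match any word that begins with the text


def term_matches(term, words):
    """True if the term (or hyphenated word pair) occurs in the word list."""
    parts = term.text.upper().split("-")
    n = len(parts)
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        # All words but the last must match exactly and be adjacent.
        if window[:-1] != parts[:-1]:
            continue
        last = window[-1]
        if term.is_root:
            matched = last.startswith(parts[-1])
        else:
            matched = last == parts[-1]
        if matched:
            return True
    return False


def score(profile, journal_weights, citation_text, journal):
    """Sum the weights of matched terms, then apply any journal weight."""
    words = re.findall(r"[A-Z0-9]+", citation_text.upper())
    total = sum(t.weight for t in profile if term_matches(t, words))
    return total + journal_weights.get(journal.upper(), 0)


def is_hit(profile, journal_weights, citation_text, journal, hit_level=3):
    return score(profile, journal_weights, citation_text, journal) >= hit_level


if __name__ == "__main__":
    profile = [
        ProfileTerm("LIBRARY", 3),
        ProfileTerm("LIBRAR", 1, is_root=True),
        ProfileTerm("INFORMATION-RETRIEVAL", 2),   # assumed weight
    ]
    journals = {"LIBRARY JOURNAL": -99}            # assumed negative weight
    print(is_hit(profile, journals, "The librarian and the computer", "Datamation"))
    print(is_hit(profile, journals, "Information retrieval for the librarian", "Datamation"))
    print(is_hit(profile, journals, "The library of tomorrow", "Library Journal"))
```

In this sketch a citation containing only LIBRARIAN scores 1 and is not
sent; the same citation with the pair INFORMATION RETRIEVAL scores 3 and
is sent; and any article from a negatively weighted journal is suppressed
however well its text matches.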

The preparation of the citation can be done variously. It can be the 
actual cataloging or indexing done by the library. In fact, this multiple 
use of library inputs is the most economical. Actually, since the SDI pro- 
grams have so far in most instances been operated outside of libraries and 
have included literature not normally processed by libraries (news articles,
management reports, patents, patent applications, computer programs,
sales brochures, etc.), the inputs have had to be prepared separately.

After a series of tests, it was determined that the citations could be
prepared by clerical help — professional indexers or catalogers or other 
technically trained personnel were not needed. In fact, since these pro- 
fessional personnel were often critical in their selection, they tended to 
omit useful material and were often slower in preparing inputs than the 
clerical personnel. The only instructions given the clerical personnel, who
were usually secretaries or keypunch operators, were to pick up author, 
title, journal citation, and, if an abstract were present, to copy it. If an 
abstract were not present, the clerks were to pick up the first few sen- 
tences of the introduction and/or conclusion and anything that looked 
"real technical," even if they did not know what it meant. The number 
of words was limited to a certain maximum, usually 20 lines of 60 char- 
acters each. Actually, in the specific applications there were also special 
instructions such as "be sure to include all references to equipment, to 
company names and locations, and industry references. Avoid repetitions. 
If the abstract is too long, remove modifying words, such as adjectives and 
adverbs, and other unimportant phrases and clauses." Such instructions, 
of course, varied somewhat with the subject areas being covered. 

When the computer matched the profile keywords with the abstract, 
it scanned the latter for all terms. That is, it looked at the whole text. 
Since a text might not have significant keywords, a simple form of index- 
ing was applied. A few so-called "directed terms" were manually assigned 
by the abstractors if they were not present in the portion of the article 
selected for SDI. An article on any aspect of librarianship had to have the 
word LIBRARY present. If it were absent, the abstractor added it. The 
"directed terms" were broken down by industry and application and in 
general did not exceed 5 or 6 terms per category for about 30-40 cate- 
gories. 

More important than the actual preparation of the references is
their selection. Since many people read their core journals and do not
regularly get to see the publications peripheral to their field, it is impor- 
tant that many publication sources, normally considered unimportant, 
be included. In the profile example shown above, the requester did not 
want to receive notices extracted from Library Journal, but he did want 
to be advised of any library or information retrieval article that might 
appear in non-library journals. Some of the early SDI systems were some- 
what less than successful because they concentrated on the "obvious" or 
"important" journals, and the only information they brought their 
clientele was already known to the users. 

Aside from providing a good and extensive selection of inputs, the 
other major problem has been the cost of SDI. One rather efficient SDI 
system serving some 2,000 users and processing between 800 and 900 
documents a month sent the average user 110 notices a month at a cost 
of $7 per month per user. SDI represents a new additional service with
new additional costs; it does not replace an existing library service and
therefore represents no displacement savings. This additional cost must
be justified. I need not tell librarians how difficult it is to justify, on a real
dollars-and-cents basis, the value of library and other informational services.
Also, since most of the inputs represent materials not processed by
the library, there is no real opportunity to share input costs. If SDI were 
to be confined only to library acquisitions, the SDI inputs would literally 
be for free. The machine processing costs are minimal and are directly 
proportional to the number of users being serviced. The major costs have 
been the reproduction and distribution of the notices to the users. (Samples
of these notices are shown in Figures 9-10.) Being printed
individually on punch cards and being mailed individually in large 
quantities to many individuals, these notices have proved to be expen- 
sive. Consideration is being given to combining SDI and KWIC or SDI 
and library bulletins; then the SDI notice would simply be a single sheet 
which would act as a mailing cover for the KWIC index or library bulle- 
tin. This sheet would call the reader's attention to publications listed in 
the KWIC or library bulletin which are probably of special interest to 
him. This combination, SDI and announcement bulletin, has the added
virtue of permitting the reader to browse in the bulletin while looking 
up his "hits." He, therefore, has fewer complaints about missing items 
in which he is interested but which he failed to include in his profile. 
Thus a single sheet would replace the many individual cards and, as
this sheet would also act as a mailing or distribution cover for the KWIC 
index or library bulletin, there would be an appreciable saving. People 
using SDI at present, however, are reluctant to give up the single card 
notification system since the cards are very prompt, many users receiving 
notices almost daily. 

In addition there are some other benefits in using the cards. In the 
second part of the notification card, which is the reply card, the user 
simply punches out with a pencil the appropriate pre-scored (Port-a- 
Punch) position and throws the card in the mail. This is his request if 
he wants the item. It also serves as a communication with the system 
operators and thus not only provides a measurement of the system's 
effectiveness, but can also be used to modify a person's "profile" or change 
the weights assigned to keywords or journals. (Figures 9 and 10). 

In one SDI system some 10,000 articles were screened each month, 
and of these 800-900 were entered into the system. The "average" user 
received notification of 110 of these articles, or about 1 per cent of the
total screened, which produced some 5 notifications per person each day. About
two-thirds of the reply notices indicated that the item was "of interest," while
about one-third were marked "no interest." The noise level was about 
34 per cent which seemed to be quite tolerable, since it takes only about 
one minute to read an SDI abstract and punch out the reply card. As the 
system developed, the profiles were modified, on an average, about 5 
times during the first year. This increased the "of interest" replies to 86 
per cent and decreased the "no interest" responses to 14 per cent. 
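
The figures in this paragraph can be related to one another with a small
calculation in Python. The counts and percentages are those quoted
above; the figure of 22 working days in a month, the assumption that the
volume of notices stays at 110 after the profiles are tuned, and the
function name sdi_load are introduced only for this sketch.

```python
def sdi_load(screened=10_000, notices=110, working_days=22,
             noise=0.34, minutes_per_notice=1):
    """Relate the quoted SDI figures for one 'average' user and one month."""
    return {
        "share_of_screened": notices / screened,           # fraction of articles screened
        "notices_per_working_day": notices / working_days,
        "irrelevant_notices": notices * noise,             # marked "no interest"
        "reading_minutes": notices * minutes_per_notice,   # time spent on all notices
    }


if __name__ == "__main__":
    before = sdi_load(noise=0.34)   # noise level at the start
    after = sdi_load(noise=0.14)    # after a year of profile changes
    print(before)
    print(after)
```

On these assumptions the user spends under two hours a month on his
notices, and tuning the profile cuts the unwanted ones from about 37 to
about 15 a month.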

Just when the research or even public library will operate a current 
awareness program such as SDI cannot be foretold. However, once all 
library inputs do receive some computer processing, and this seems to be
inevitable for all but the smallest libraries, then an SDI service will have
to be provided. Once the bibliographic information is machine scannable, 
the SDI outputs cost practically nothing, for essentially these are only 
the reproduction or printer costs. The dissemination costs will vary with 
the situation. In any event, the value of the service and the demands for 
it will, I am sure, force the librarian to become a true disseminator of 
information and not, as he is so often maligned, just a custodian.

[Figure 9. Sample SDI notice and reply card (SDI System, IBM ASDD,
Yorktown Heights, N.Y.). The notice carries an abstract, a KEYWORDS line,
and a page reference. Reply boxes: Would Request If It Were Available;
Of Interest But Would Not Request It; Of Interest, Have Seen Before;
Of No Interest. Instructions: 1. Read the abstract. 2. Respond by pushing
out the appropriate box. 3. Return this card to SDI.]



Library Resources & Technical Services 



[Figure 10. Sample SDI notice card. Citation shown: Sheehan, Robert,
"A New Kind of Ford Motor Co.," Fortune, February 1962. Instructions:
1. Read the abstract. 2. Mark the appropriate box. 3. If you care to
comment, mark the comment oval and write your comments on this card.
4. Return this card to SDI. Reply boxes: Of Interest, Document Requested;
Of Interest, Document Not Wanted; Of Interest, Have Copy; Of No Interest;
Comments.]



Volume 9, Number 1, Winter 1965






The Library of Congress Project 



Gilbert W. King 
Itek Corporation 
Lexington, Massachusetts 



In April 1961 the Council on Library Resources, Inc., provided a
$100,000 grant to an agency of the U.S. Government, the Library of
Congress, to examine the feasibility of automation of the large research
libraries such as itself. A study group¹ was assembled and investigated the
problems existing in the Library of Congress and other libraries, the possi- 
ble technical solutions likely to be available, and the present state of the 
art in operating a system from the point of view of programming and 
organization. 

The results of this study were published² at the end of 1963. The time
taken to complete this study reflects the care with which issues were examined,
especially in the light of rapid advances in the pertinent technology
in those years. A great many facets were investigated in more detail
than might appear in the published report. Some topics were elaborated
at the Airlie Conference in June 1963,³ especially in the field of communications
in the type of system proposed, equipment for which had been
rapidly developing as the report was being written.

In addition to all of this material which was published, many specific
pieces of the puzzle were examined which will come to light as, hopefully,
the project advances. 

The principal feature of the present plight of large libraries is, of
course, the large and growing volume of books and serials, which causes
frustration to librarians, with their chronic lack of funds, in control, and
to the patron of the library in his endeavors to be informed. The library
problem is worth serious national attention because, both traditionally 
and currently, collections of written material reflect the status and nurture 
the future of civilizations. The Library of Congress alone represents a great 
national resource; a mine of information which could be retrieved to ad- 
vance our culture and save very large sums of money in national projects. 
This is not to say that the report was deeply concerned with methodology 
of information retrieval being studied in many places at various levels of 
sophistication. It was directed at a purely practical means of controlling 
and finding bibliothecal material by more-or-less known interfaces with
traditional libraries, although the system concept was designed to permit,
without radical equipment alterations, any and all new retrieval methods
if and when they appear practical for very large collections.

¹ The members of the group were G. W. King, Chairman; H. P. Edmundson; M. M.
Flood; M. Kochen; R. L. Libby; D. R. Swanson; A. Wylly. The group were greatly
assisted by H. J. Dubester; Barbara Markuson; D. F. Loeb; H. T. Spiro; A. D. Kotin.
² Automation and the Library of Congress. Washington, Library of Congress, 1963.
³ Libraries and Automation, Proceedings . . . ed. by Barbara E. Markuson.
Washington, The Library of Congress, 1964.

A solution of the libraries' problem was found and outlined. Many 
specific details were considered, but not described in the report because in 
an era of rapidly developing technology, many improvements can be 
anticipated by the time of substantial funding of the program. 

The principal equipment items that are needed for an automated li- 
brary are mass digital memories, mass graphical memories, a communica- 
tion network for digital and graphical information, consoles for man- 
system interaction, and switching equipment to monitor data flow. There 
is no point in basing a study of this sort solely on existing equipment now 
on the shelf. Attention should be focussed on what is likely to be available
at the appropriate points in time scheduled for a properly funded project.
However, although an item would be feasible if a certain amount of
development resources were devoted to it, there is some question whether
such development will occur. Libraries need equipment devoted to their
kinds of data, typically languages, not numbers. The library community
is not an attractive market, in its present state of funding, to the equipment
manufacturers. This is why it is so important that a large-scale project,
such as the automation of the Library of Congress, be adequately
funded, so that the whole library community may be able to bring itself up to
the technological well-being attendant to other fields.

The storage of the massive amounts of lexical material in libraries in 
digital (machine readable) form such as on the card catalogs, not to 
mention citations, abstracts, tables of contents or full text, requires memories
of a size and organization not being met elsewhere in the data-processing
industry. Nevertheless, suitable types of memories have been
made for special projects, and it seems likely they could be further developed
from the libraries' point of view. It is very likely that a digital memory
with a capacity of a trillion bits (thirty billion words) will be commercially
available before fiscal year 1966.

The storage of large amounts of materials, such as newspapers and 
serials, as photographic images of their pages is mandatory in an efficient 
library. Here again there is not available the equipment really exploiting 
modern technology for the needs of large libraries. For example, reductions
of 200 to 400 to one are quite practical for photographs of newspapers,
such reduction making a phenomenal improvement in space requirements
and speed of access.

The problem of communicating the stored digital or graphical information
to the requestor is solved, although costs for real-time communication
over large distances are still high for libraries in their present status
in the community.

An automatic library requires terminal sets, or consoles, that is, means 
for the user to obtain ready access to the contents of the memories. These 
basically consist of a display such as a television tube, with, however, 
considerably higher quality, to allow textual material to be read easily.
Good quality and a large font of all the letters and symbols used in literature
are essential if substantial use of the library is to be encouraged.
The same equipment should be able to display photographs of printed 
pages and pictures retrieved from the mass graphical file. The console is 
the site of operation of the user, so it must have certain convenience fea- 
tures. For example, there has to be a simple keyboard by which an un- 
trained user can communicate his requests to the system. It should have a 
local storage device on which the user can build up a file of the pieces of 
information he is retrieving, so that he can go back and forth in referring 
to it. It should have means of giving him low-cost hard copy of selected 
material he has been shown and temporarily stored. 

The console is perhaps the most troublesome piece of equipment 
visualized for the library system. For, although a console meeting the 
requirements could be built, its cost is likely to be quite high, unless the 
market, i.e., number of units, is quite substantial. It is to be hoped that 
the Library of Congress project will provide such consoles, acceptable to 
other systems in the library community. 

In addition to equipment, an automated library system needs organization
and methodology. The novel feature of this is that the system
itself has to help the user in locating the information he is after. It is not
enough to let him ask simple questions, such as a request for (a display of)
the catalog card of a book by "Smith." He would be, in general, flooded by too much
material. The program of the system must guide him, with questions and 
suggestions displayed at his console, in the way a reference librarian 
discourses with him. Normalized wording of his questions by an auto- 
matic thesaurus and even automatic normalization of syntax are distinct 
possibilities. In this area there is the opportunity of a vast improvement 
in the interaction of the user with a library. 

Related to the development of automation that helps in the use of a library's
contents is the means of loading the memories with the large amount
of material generated by a library in its bibliographic control and description.
There is the basic difficulty of transferring all the material now
printed on cards, reference books, etc., into machine-storable (digital)
form, and there is also the serious matter of not losing information in so
doing: of not losing the typographical and format clues which now aid
the user in scanning printed material, aids which must be retained for
the display at the console. Here again various approaches and prospective
solutions were studied as background for the report. These need a concerted
effort to become practical and economical, one in which the library
community as a whole should become involved.

In considering file conversion, it is important to realize that imposi- 
tion of standardization as to coding, symbols, or format used is not at all 
necessary. On the contrary, it is highly desirable, and feasible in the system 
implied in the report, that essentially any type of organization or index- 
ing can be introduced into the central system. Internally, conversions 
from one structure to another can be made. 

Perhaps the most important feature of the system proposed, in general terms
in the report but in detail as background for the report, is its emphasis
on a structure which does not impose any constraints on present
or future cataloging, classification, or indexing schemes. Thus, all of the 
bibliographic work which has been done in the past will be saved and in 
the long run incorporated for future users. In addition, new services, 
such as citations and bibliographies, generated by the use of the system, 
will be preserved, and make automated libraries more useful by far than 
their manual counterparts.



The MEDLARS Project at the National
Library of Medicine 

Charles J. Austin, Chief 
Data Processing Division 
National Library of Medicine 
Bethesda, Maryland 



Introduction 



MEDLARS (Medical Literature Analysis and Retrieval System) is a
computer-based information storage and retrieval system which
recently became operational at the National Library of Medicine. It is the
first phase of an aggressive program of research and development in
documentation aimed at the improvement of biomedical communication 
and the use of modern technology in library operations. 

MEDLARS joins the intellectual talents of professional literature 
analysts to the tremendous clerical processing capabilities of an electronic
computer in a unique man-machine relationship. It is a bibliographic
system aimed at control of a large segment of the world's
biomedical literature and the rapid dissemination of this data to those
engaged in medical research and practice.

The objectives of MEDLARS can best be defined by describing its 
three main output products. A more detailed explanation of the tech- 
niques used in obtaining these outputs is given later. 

The first MEDLARS product is the Index Medicus, a monthly subject 
and author index to some 2,500 biomedical journals published in all 
parts of the world. The computer improves the index in various ways. 
The capacity of the machine for storing and manipulating a large vol- 
ume of data makes expanded coverage of the literature possible, and the 
timeliness of the index is improved by reduction in the throughput time
required to prepare the monthly editions for printing. The second prod- 
uct of MEDLARS is the recurring bibliography— a current-awareness 
list of citations in specialized medical subject areas. The system will pro- 
duce up to fifty different recurring bibliographies compiled at regular 
intervals from data in the computer files. These compilations will be 
prepared in photo-master form, and will be printed and distributed by 
organizations working in the specialty fields. Demand bibliographies rep- 
resent the third and final major product of the system. Rapid searching 
of the computer's store of data can provide answers to complex biblio- 
graphic requests which cannot be effectively handled by reference to a 
printed index or catalog. 

MEDLARS was developed under contract by the General Electric 
Company's Information System Operation in a three-phase effort. Phase
I, a preliminary study and design phase, lasted from July 1961 to January
1962. This phase included development of a basic set of specifications
for equipment, programs, and personnel required to implement
the MEDLARS objectives. Phase II, detailed design, began in January
1962 and included ordering of equipment, writing and testing of computer
programs, and development of detailed procedures to be followed
by personnel operating the system. Phase III, system testing and imple- 
mentation, overlapped Phase II and included installation of equipment, 
file conversion, detailed testing of all parts of the system, and a period 
of preliminary operation. Phase III will end this summer. 

System Loads 

A brief description of system loads will help point out the need for 
mechanization. The current annual volume of approximately 150,000 
papers indexed is expected to increase to 250,000 by 1969. In addition, at 
some future date about 10,000 citations of monographs will be entered 
into the system annually. This represents an annual input load growing 
from 62 million characters currently to 100 million in five years. Output 
loads will be proportionately larger. The system will produce the 
monthly and annual Index Medicus, up to fifty recurring bibliographies
of varying periodicities, and a number of demand requests expected to 
grow from an initial load of ten per day to ninety per day after five 
years. The output printing load will grow from about 290 million char- 
acters in 1964 to 590 million in 1969. 

Data Processing Equipment 

In order to meet the speed and volume requirements of the system, the 
following items of automatic data processing equipment are employed: 
thirteen punched-paper-tape typewriters for conversion of source data 
to machine-readable form; a Minneapolis-Honeywell 800 digital com- 
puter for editing, sorting, compressing, merging, and formatting data 
for subsequent printing; an optical printer called GRACE (Graphic 
Arts Composing Equipment) manufactured by the Photon Corporation 
and used to convert computer output into high-quality photo-copy; and 
an automatic film processor for developing the film from GRACE, thus
producing a photomaster which can be used directly for burning of 
printing plates. 

With this description of the objectives of MEDLARS, system loads, 
and data processing equipment as background, the actual functioning of 
the system can be described in more detail. Any data processing system 
normally consists of three parts — preparation of input data, manipula- 
tion of this data automatically, and output preparation of the final 
products of the system. 

Input Preparation 

Journals received by the Library at the serial record checking area are 
forwarded to the Index Unit where they are distributed to the indexing
staff for the translation of foreign titles and subject classification of
articles with appropriate descriptors from the Library's controlled list 
of terms called MeSH (Medical Subject Headings). 

To transform this basic data into mechanical form, a punched-tape 
typewriter (Friden Flexowriter) is used. This machine produces hard 
copy as well as punched tape when the keys are depressed. The input 
typist integrates information from a data sheet prepared by the indexer 
with information from the journal article itself. In addition, codes 
are added so that the computer can recognize individual elements of the 
record. Hard copy from the typists is sight-verified by a proofreader who 
notes necessary corrections on a work sheet. This work sheet is returned 
to the typist who made the error, and a correction tape is prepared. Both 
the original input tape and the correction tape are later matched by the 
computer and necessary substitutions are made. The basic unit record 
prepared by this process consists of all of the standard elements of a 
bibliographic citation, plus the subject headings assigned by the indexers. 

Search requests for bibliographic information are prepared by a team 
of search specialists and recorded on a Search Formulation data sheet. 
Whereas indexers describe articles by selecting the appropriate subject 
headings, searchers prepare a set of elements which are used in logical 
combination for retrieving citations already indexed and in the computer 
files. The request may include as many as 100 search elements, including 
such items as subject headings, author names, journal titles, language 
designators, year of publication, and several others. The Search Formu- 
lation data sheet is punched into paper tape and proofread in a similar 
manner to the Indexer data sheet. 
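
The "logical combination" of search elements can be illustrated with a short Python sketch. It is an illustration under assumptions, not the MEDLARS search logic: the three groups "all_of", "any_of", and "none_of", the field names, and the function name matches are all hypothetical.

```python
def matches(citation, request):
    """True when the citation satisfies the request's logical combination.

    citation maps a field name to the set of values recorded for it.
    request may hold three hypothetical groups of (field, value) elements:
      "all_of"  - every element must be present (AND)
      "any_of"  - at least one must be present (OR); empty means no constraint
      "none_of" - no element may be present (NOT)
    """
    def present(element):
        field, value = element
        return value in citation.get(field, set())

    all_of = request.get("all_of", [])
    any_of = request.get("any_of", [])
    none_of = request.get("none_of", [])
    return (all(present(e) for e in all_of)
            and (not any_of or any(present(e) for e in any_of))
            and not any(present(e) for e in none_of))


citation = {
    "heading": {"ASPIRIN", "FEVER"},
    "language": {"ENG"},
    "year": {1964},
}
request = {
    "all_of": [("heading", "ASPIRIN"), ("year", 1964)],
    "any_of": [("language", "ENG"), ("language", "FRE")],
    "none_of": [("heading", "MALARIA")],
}
is_hit = matches(citation, request)
```

Here the request combines five elements drawn from subject headings, language designators, and year of publication; a real formulation, as noted above, may run to as many as 100.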

Computer Processing 

Seven programming modules have been designed to fulfill MEDLARS 
system requirements. A module is a group of computer programs all of 
which are related in performing one of the system's major functions. 

After conversion to punched-paper-tape, indexed citations are entered 
into the Input Processing Module, which edits the data extensively, con- 
verts English-language elements such as subject headings into their coded 
equivalents, and builds the two main computer files on magnetic tape. 
The Compressed Citation File (CCF) is a highly coded, time-sequential 
store used for high-speed searching to retrieve demand bibliographies. 
The Processed Citation File (PCF) contains unit records in expanded 
print-line format which have been selected and tagged by this module 
for various recurring bibliographies. Once the two basic data files have 
been created by the Input Processing Module, the computer subsystem is 
split into two independent parts— one for retrieval of demand bibliog- 
raphies and the other for the composition of Index Medicus and recur- 
ring bibliographies. 

The Demand Search Module matches search requests entered into the 
computer by punched-tape against unit records on the CCF. It is impor- 
tant to note that several requests are entered in a large batch (perhaps 
50 to 75), and all searches are performed simultaneously in the com- 
puter's central processor. Output from this module consists of a mag- 
netic tape file of retrieved citations plus a printed report of the number of 
citations retrieved for each request. 
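
The value of batching is that the citation file is read only once, however many requests are waiting. A minimal Python sketch of that idea follows. It is not the Demand Search Module: each request is simplified to a set of required subject headings, and the names batch_search, tape, and requests are assumptions for illustration.

```python
def batch_search(citation_file, requests):
    """Read the file once and test every request against each record.

    citation_file: iterable of (citation_id, set_of_headings) pairs
    requests: dict mapping a request number to the set of headings required
    Returns (retrieved, counts). retrieved lists (request_number, citation_id)
    pairs; counts maps each request number to its number of citations.
    """
    retrieved = []
    counts = {number: 0 for number in requests}
    for citation_id, headings in citation_file:       # one pass over the file
        for number, required in requests.items():     # all requests per record
            if required <= headings:                   # every heading present
                retrieved.append((number, citation_id))
                counts[number] += 1
    return retrieved, counts


tape = [
    (101, {"ASPIRIN", "FEVER"}),
    (102, {"KIDNEY", "HYPERTENSION"}),
    (103, {"ASPIRIN", "KIDNEY"}),
]
requests = {1: {"ASPIRIN"}, 2: {"KIDNEY", "ASPIRIN"}, 3: {"MALARIA"}}
retrieved, counts = batch_search(tape, requests)
```

The two return values correspond to the two outputs named above: the file of retrieved citations and the report of the number retrieved for each request.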

The Report Generator Module prints bibliographies of citations re- 
trieved by the Demand Search Module. Sequence and format of these 
bibliographies are variable and are specified by the search specialist, who 
first reviews the report telling how many citations were retrieved and 
uses this information as a basis for determining output format. The 
specialist also determines whether the final product is to be printed on 3" 
X 5" cards, 8½" X 11" paper, or photo-typesetter film. The computer 
takes about five minutes to search one complete reel of magnetic tape 
containing roughly 25,000 citations. The file currently holds about 
190,000 references entered since 1963, and a search of this file for the an- 
swers to several requests takes approximately 30 minutes. 

Composition of recurring bibliographies is the function of the Out- 
put Processing Module. Each working day, punched cards are entered 
into the computer to specify which recurring bibliographies are to be 
printed that day. The Output Processing Module selects citations from 
the PCF, using information from these cards. The module then sorts the 
citations into the proper sequence, formats the data, and writes it onto 
magnetic tape for later use by the photo-composer. 

There are three utility modules in the computer subsystem. The File 
Maintenance Module is used to update the two main data files (CCF and 
PCF) with additions, deletions, and changes. The MeSH Generator Mod- 
ule performs a similar function with the master file of medical subject 
headings. The Statistical Module produces reports on the frequency of 
use of the system, such as the number of times each subject heading is 
used, the number of articles indexed by language, etc. 

The four main operating modules (Input, Demand Search, Report 
Generator, Output) are run on the computer once a day, but the utility 
modules are run at less frequent intervals and only when needed. 

Output Preparation 

The unique aspect of MEDLARS is its special output device, GRACE. 
This machine prints at the rate of 300 characters per second from a font 
of 226 characters (including upper and lower case). Data is transferred 
directly from the computer via magnetic tape and composed onto posi- 
tive photographic film or paper. 

GRACE will be used primarily for compiling Index Medicus and recur- 
ring bibliographies. The computer also has an on-line printer which is 
used for printing demand bibliographies prepared by the retrieval pro- 
grams. Although this printer is about six times as fast as GRACE, it has 
a limited type font of 56 characters, upper-case only. A delay in delivery 
of GRACE one year beyond the originally scheduled date has forced the 
Library to prepare the Index Medicus issues this year by the computer's 
on-line printer. This less-than-desirable format will be replaced by 
GRACE composition in the very near future, and the Cumulated Index 
Medicus for 1964 will be printed by GRACE. 

Magnetic tapes produced by the computer are mounted on the com- 
poser which prints complete pages on film in accordance with instruc- 
tions received from the tape. Upon exposure of 100 feet of film (the 
equivalent of about 100 printed pages), the film is removed and taken 
to an automatic film processor for developing. The processed film is 
later inspected, cut into sheets, packed, and delivered to a printer for 
platemaking, printing, and binding. GRACE eliminates the expensive 
and time-consuming job of typesetting required in many publication sys- 
tems. 

Data Processing at NLM 

Data processing is an integral part of MEDLARS since the very con- 
cept of the system is the rapid storage and retrieval of pertinent refer- 
ences from a large information file. For this reason the Data Processing 
Section was established in the Bibliographic Services Division of the 
Library in August, 1962. The primary responsibility of this section was 
to work with the MEDLARS contractor in installation of the system, and 
prepare to take over upon completion of the contract. 

On March 18, 1964, a new Data Processing Division was formed at 
NLM. This division was charged not only with carrying on the operation 
of MEDLARS, but also with initiating new systems studies involving the 
potential use of data processing equipment and techniques in other 
areas of the Library. 

A Systems Analyst joined the Library staff in January of 1964 to begin 
a systems study of technical processing. It is hoped that this study will 
lead to improvements in the selection, acquisition, and cataloging of 
new materials. Other possible future data processing projects include 
automation of the card catalog itself, investigation of graphic image 
storage and retrieval and its relationship to MEDLARS, possible mecha- 
nization of the serial record, and statistical analysis of the utilization of 
MEDLARS. 

The Library also plans to decentralize the MEDLARS search capabili- 
ties by distributing magnetic tapes to university and other medical re- 
search centers with adequate library and computer facilities. A pilot 
study will begin later this year, and will involve re-programming and 
tape conversion for MEDLARS searching on computer equipment at a 
large university medical center. 

Impact of MEDLARS on Other Libraries 

MEDLARS will result in much broader dissemination of medical bib- 
liographic data through its new and expanded indexing services. This, in 
turn, will impose great demands for increasing the size and scope of the 
collections of local medical libraries to provide the journals cited in a 
MEDLARS bibliography. The system will provide every local librarian 
with a modernized version of a traditional reference tool for rapid 
searching of a large store of information, since he will be able to request 
demand searches from NLM. At the same time it will impose a serious 
obligation on the librarian to become familiar with the fundamentals of 
computers in order to understand their capabilities and limitations. Fi- 
nally, in the future some libraries may find it possible to acquire a mag- 
netic-tape edition of the Index Medicus for use on their own data proc- 
essing equipment. 

Summary 

MEDLARS is a computer-based information retrieval system aimed 
at bibliographic control of the world's medical literature. Medical 
journals received by the Library are indexed by a staff of professional 
indexers, punched into paper tape, and permanently stored on magnetic 
tape. Data from the magnetic tape files are manipulated by the com- 
puter in order to produce the monthly and annual Index Medicus, other 
recurring bibliographies in medical specialty fields, and demand bibliog- 
raphies in response to subject oriented questions. The Data Processing 
Division at NLM is heavily involved in the operation of MEDLARS and 
in systems studies aimed at the improvement of biomedical communica- 
tions and modernization of Library operations. 

CATALOG CODE REVISION SCHEDULE 
In LRTS Fall 1964, p. 365, I stated that the catalog code revision schedule 
calls for a completed manuscript to be presented to the CCS Executive Com- 
mittee "in the summer of 1966, instead of (as originally planned) 1965." 
These dates are not correct. 

Actually the schedule calls for a completed manuscript by the summer of 
1965, instead of (as originally planned) 1964. — Paul S. Dunkin 

The Machine and the Librarian 



Ralph H. Parker, Librarian 
University of Missouri Library 
Columbia, Missouri 

THERE CAN BE no question who will be the eventual master. When 
we hear expressions of fear of the machine, what is really meant is 
that we fear other men's use of it. For this, history provides ample justi- 
fication. The process of civilization may be characterized as the grow- 
ing use of mechanical devices and the struggles of mankind to direct 
that use. Without mechanical assistance man's energies were consumed 
in providing subsistence; there was no leisure for creative thought. But 
once a slight margin above subsistence was obtained, it became possible 
to utilize the resulting leisure either for social good or selfish gain. Even 
slavery and serfdom can exist only in a society where the means of 
production exceeds the demands of consumption. 

This difference between productive capacity and consumer demand 
has grown in a very tight spiral. As capacity has increased, so has the 
standard of living. That part of the difference which is turned back into 
machines of production serves to widen the spiral. What we know as the 
Industrial Revolution of two centuries ago ushered in the long period 
of providing mechanical power to replace human power. Steam was 
the key in the process which may be characterized by the power driven 
loom, power saws, and steam boats. We need not elaborate on the signifi- 
cance of these and similar inventions. Steam was followed by the inter- 
nal combustion engine, then by electricity, and now by nuclear fission 
and nuclear fusion. The machines which, in the nineteenth century, 
created unemployment and social upheaval are today taken for granted, 
and account in large measure for the change in standards of living. 

The machines which today are creating unemployment and exerting 
pressures on the social structure are part of what we might call the Pa- 
per Revolution. Its background includes the invention of the linotype, 
the typewriter, the rotary press, offset lithography, and teletype. More sig- 
nificant, however, are the machines which have descended from a dif- 
ferent line. Charles Babbage and his calculating machine of the middle 
nineteenth century, Hollerith and his concept of punched cards, which 
in turn was borrowed from the textile industry of the eighteenth cen- 
tury, were fathers of the modern electronic computer. This product of 
the second half of the twentieth century is more than just a calculating 
machine. Its capacities for mathematical computation can also be used 
for the manipulation of non-quantitative data as well, and even for the 
control of other machines. Thus the machine, which in the eighteenth 
century replaced man as a source of power and converted labor to the 
supervision of machines, is now followed by a machine which even re- 
places man as a supervisor of a machine. 

Of what does automation consist? Mechanization, yes; but two more 
things are needed: the concept of an integrated system, and what is com- 
monly called "feed-back." There is, of course, automation without com- 
puters, but their enormous capacity for feed-back vastly expands the 
potential for sophisticated systems. 

Automation is based on what we might call machine learning, the ap- 
parent ability of a machine to learn, a capability such that a decision 
once made can be remade by the machine without human intervention. 
Because much human activity is at this intellectual level— masquerading 
as thoughtful activity — machines have erroneously been endowed with 
the power of thinking. It is probably unnecessary to repeat that ma- 
chines do not think, that they simply do rapidly and accurately tasks of 
routine repetitive nature. 

Although there is awe and admiration for machines, there is also 
snobbery about things which they produce. The Paris gown and handblown 
glass, where an imperfection is a mark of authenticity, are examples of 
this snobbery. The Gutenberg Bible was once a cheap machine-made imi- 
tation of a manuscript! It is possible to create cheap machine goods; it is 
not now economically feasible to use hand labor for this purpose. Paper 
cups, which are a socially unacceptable substitute for porcelain and fine 
glass; paper napkins, which are becoming an acceptable substitute for 
fine linen; the toy balloon and the party whistle can exist only in a sys- 
tem of mass production. But factory production is not necessarily syn- 
onymous with poor quality. The electric refrigerator is a marvelous ex- 
ample of beauty, precision, and durability; no handmade product could 
equal it. 

Automated systems do create differences in people, however. The arti- 
san's sense of personal pride in his creation, whether it be a pair of shoes 
or a page of manuscript, has little place in a system relying on statistical 
methods of quality control. Until our society has developed a satisfac- 
tory substitute for this personal pride of workmanship there will be com- 
plaints of employees who do not care. This is an area of challenge for 
the professional of today and of tomorrow — the professional in indus- 
trial relations, the professional educator, and even the professional li- 
brarian. 

What can machines do in libraries, and what of the librarians them- 
selves? In spite of the predictions of some, computers will perform dull 
repetitive jobs in libraries in the same way that they are performing dull 
repetitive jobs in business and in scientific research. When mixed with the 
ingenuity of human beings, the results may well be marvelous. 

The books of which libraries are composed are rather mobile. Who 
hasn't heard allusions to them walking off or to the ease with which they 
may be borrowed, secreted, or purloined by the willful? Records, on the 
other hand, are viscous. The bibliographic organization of a library is 
far more durable than its books or the building which houses them. Li- 
brarians may not have analyzed the problem in this light, but this viscos- 
ity is primarily responsible for their being called narrow-minded, unre- 
sponsive to the needs of the user, and even obstructive to culture and 
enlightenment. How often the librarian is unable to fulfill the wishes 
of the user because of the costs of recording the change in location of 
the book or of maintaining adequate control of its availability! 

The newer technology which is now emerging will free the record; it 
will be possible to reorganize collections for short term use as well as for 
long term repose. The distinction between the circulation file, the cata- 
log, bibliographies, and indexes will largely disappear. 

But there is a danger that the machine will become temporarily the 
nominal master. Machines have limitations in their capabilities, and the 
people who operate them tend to let the machine dictate what and how 
things are done. When this happens, it is a defeat for human ingenuity. 
Of course we must work within the confines of existing technology, but 
it need not be a strait jacket. 

Printing in more than one color was economically not feasible in the 
sixteenth century. Type designers overcame the limitations by develop- 
ment of variations in style and size of type. The expert photographer 
in black and white can use techniques to compensate for the loss of 
color. The illusion of depth can be created in a flat representation. We 
must not cease to strive for improvement in the technology with which 
we must live. Just because most data processing machines have type fonts 
limited to 39 or 47 or 63 characters is no reason why bibliographies or 
catalogs in libraries must be geared to them. If the use of machines 
means lowering of quality, assuming that our standards of quality have 
real validity, then we must avoid using the machine. 

The fact that it is not possible to program a computer to file by the 
ALA rules is not a valid reason to settle for records arranged in a se- 
quence which the computer is capable of creating. Although it may not 
be possible for a computer to figure out how to put an entry into the de- 
sired sequence, it can be taught to put it there after it has been shown 
the first time. And who will decide where it should be put in the first 
place? The Librarian, of course; but he need make the decision only 
once. 
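
The principle that the librarian decides once and the machine repeats the decision can be sketched in Python. This is an illustration only: the table of filing decisions, the filing forms shown in it, and the function name file_entries are hypothetical, and no claim is made that they reproduce the ALA rules.

```python
def file_entries(entries, filing_decisions):
    """Arrange entries by the filing form a librarian has already recorded.

    filing_decisions maps a heading to the form under which it is to file.
    Headings with no recorded decision are returned separately, so that a
    librarian can decide once and add the result to the table.
    """
    decided = [e for e in entries if e in filing_decisions]
    undecided = [e for e in entries if e not in filing_decisions]
    return sorted(decided, key=lambda e: filing_decisions[e]), undecided


# Decisions made once by a librarian (illustrative filing forms).
decisions = {
    "St. Louis": "saint louis",
    "Saint Joan": "saint joan",
    "100 great books": "one hundred great books",
}
entries = ["St. Louis", "Sacramento", "100 great books", "Saint Joan"]
filed, needs_decision = file_entries(entries, decisions)
```

The machine never works out where "St. Louis" belongs; it only looks up and repeats what it was shown. New headings are set aside for the librarian rather than filed by guesswork.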

Let us repeat, the machine cannot do intellectual tasks, and presum- 
ably the professional work of the librarian is intellectual. But the ma- 
chine can do repetitive ones and do them far more effectively than can 
the librarian. The effectiveness of the professional can be multiplied by 
storing the results of intellectual activity and retrieving them by ma- 
chine when they are needed. This is what we do, without a machine, 
every time we consult an index or even the text of a book. But the ma- 
chine can extend the fruitfulness of the original creator by searching 
and comparing and printing out only those items which match — or do 
not match — the criteria established. 

In short, automation of records in libraries will free librarians, 
whether they wish it or not, to become truly professional. Their jobs may 
be far different, but they will still be needed. The library of the future 
will also need technicians to operate the machines, and for some time to 
come, it will need people to shelve books. The catalog typist, the file 
clerk, and the librarian who is only a paper shelver will largely disappear. 

These prospects may create some sense of fear, but they will have little 
impact on the change which is taking place in our society. For one thing, 
libraries are too insignificant a part of it; furthermore, most libraries 
have too much difficulty in obtaining people to do these tasks now for 
anyone to regret the passing of the jobs. In the long run, the capacity of 
the library for rendering service and the demands for that service will be 
so vastly increased that the number of people still working in libraries 
will continue to grow. 



THE DECIMAL CLASSIFICATION EDITORIAL POLICY 
COMMITTEE: ANNUAL REPORT 1963/64 

The chief business of the Committee at its meeting in October 1963 was the 
approval of final details of proposed changes in Edition 17. Perhaps the most 
important item was the decision to adopt a new method of indicating geographic 
arrangement through the presentation of the full list of geographic headings in 
standard subdivision .09 rather than referring from this point to the history 
schedules in the 900's. We believe that this procedure will make the application 
of the numbers easier and also that it will permit the logical separation of geo- 
graphic concepts from the present limitations of the historical approach. Among 
other actions, the Committee also reversed a previous decision to print both 
new and obsolescent schedules for Psychology and decided that only the pre- 
ferred schedule should be presented. 

The Foreign Survey of the Use of the Decimal Classification got under way 
in the Spring, under the general supervision of Edwin B. Colburn as Chairman 
of the Steering Committee. Harriet MacPherson was appointed as Director of 
the Survey but was forced to withdraw because of her health. Sarah K. Vann 
succeeded her, and secured the assistance of Pauline Seely as a surveyor. Both 
Miss Vann and Miss Seely have been abroad during the Spring. Miss Vann is 
planning a further trip during the Fall. A preliminary report has been made 
of suggestions so far received, but a final report awaits the completion of Miss 
Vann's travels. 

At the October meeting Wyllis E. Wright was elected Chairman and Carlyle 
J. Frarey Vice-Chairman for the term 1963/66, and Deo B. Colburn Secretary 
with an indefinite appointment. 

The members of the Committee during the past year have been Carlyle J. 
Frarey, Esther J. Piercy and Pauline A. Seely, representing the American Library 
Association; Edwin B. Colburn, Marietta Daniels Shepard and Wyllis E. 
Wright, representing the Lake Placid Club Education Foundation; Godfrey 
Dewey, Virginia Drewry and Joseph W. Rogers as the continuing members 
representing the Forest Press, Inc., American Library Association, and the Library 
of Congress respectively. — Wyllis E. Wright, Chairman 

Statement on Types of Classification 
Available to New Academic Libraries 



Introduction 

NEW ACADEMIC LIBRARIES are faced with a rather important 
decision at their very beginning. What classification system will 
prove most adaptable and most durable over a long period of time? The 
Classification Committee, at the request of the Cataloging Policy and Re- 
search Committee, has studied this problem carefully. The statement 
embodying our conclusions does not consist of arbitrary recommenda- 
tions for one scheme or another, but sets forth the characteristics of the 
major classifications as they apply to different situations. The question- 
and-answer method has been used for convenience in helping a library 
define its own situation. The Committee was extremely fortunate in hav- 
ing as a member, Miss Gertrude L. Oellrich of Alanar Book Processing 
Center, who is actually engaged in classifying with several schemes simul- 
taneously and who, therefore, is in a good position to compare them. 

Definition 

The purpose of classification is defined in this statement as a systema- 
tic, subject-oriented arrangement for shelving, a location device for open 
or closed shelf collections of books, not for the classification of knowl- 
edge. 

Statement 

The field of study was narrowed to a survey of the comparative merits 
of the Dewey Decimal (DC) and Library of Congress (LC) Classifica- 
tions. Both these systems are growing, are being kept up-to-date with quar- 
terly revisions, and Dewey, at least, now has a users' guide to go with it. 1 
One or the other of these two classifications is now used in the majority 
of academic libraries. 

1. Is it important to consider other classification systems in addition to 
the Dewey Decimal and Library of Congress schemes? 

There are several other classification schemes available, most being 
used in libraries somewhere in the world. 

BLISS — This is used rather extensively in Australia. It is a good logical 
system, but is not being kept up-to-date for ease of usage. The manu- 
script version does not agree with the published version. Letter nota- 
tion. 

* Report of the Classification Committee, RTSD Cataloging and Classification Sec- 
tion, May 15, 1964. The Committee: Pauline Atherton, Joan Cusenza, Elva Krogh, Ger- 
trude Oellrich, Elizabeth Overmyer, Annette Phinazee, Phyllis Richmond, Chairman. 

RIDER— This is a new scheme quite similar to the Dewey Decimal Clas- 
sification, but with a letter notation as in Bliss. There is no way of up- 
dating it except as individual libraries undertake the job. 

READER INTEREST— This system is more suitable for a public li- 
brary which must cater to constantly changing interests. 

UNIVERSAL DECIMAL— This is a European adaptation of the Dewey 
Decimal Classification. It is greatly expanded in the science and tech- 
nology sections to serve the purposes of scientific documentation. 
Except in occasional areas, the rest has scarcely more depth, and, in 
some cases, less modernity than Dewey. It is too lopsided for a general 
library, but would be suitable for special collections in scientific and 
technical subjects. A scientific-technical edition in English, with a 
good guide written by Jack Mills, is available. 2 This classification is up- 
dated periodically. If a new and centralized secretariat is established 
and the major revisions now under consideration are adopted, it may 
be of greater significance than it is now. A classification system to 
watch. 

COLON— This system is used at various establishments in England and 
in India. The current (6th) edition schedules are rather limited in 
scope. It is very difficult to use because it necessitates an attitude of 
mind that is totally different from that employed in any other classi- 
fication process. At present there is no good guide to its use. The ex- 
planatory portions in the 6th edition are extremely difficult reading 
and less than clear. 

FACETED CLASSIFICATION— The schemes of this type developed so 
far are for specialized subjects. Until a general system is developed, 
this type of classification is not suitable for a generalized library. 

2. What characteristics influence the choice of a classification? 

COMPREHENSIVENESS— LC is much broader and more comprehen- 
sive than DC. It permits finer (closer) classification. The "P" sched- 
ules, in particular, are tremendous in size and, while hard to learn to 
use, have much "elbow room." 

FLEXIBILITY — LC has the advantage of not being logical in exposi- 
tion, as a rule, and while it is practically impossible to memorize, it is 
easy to expand without upsetting existing classified books. The ad- 
vantage of a non-logical classification is apparent in dealing with 
rapidly advancing subjects, as the sciences, where a major change in 
thought can throw out a whole branch in a previous arrangement of 
knowledge. LC can interpolate where DC must compromise. 

Dewey has to be expanded through further breakdown, sub-clas- 
sification or re-naming and reassigning classes. LC can be expanded 
by interpolation because the whole system does not have to be logical 
but can, to a considerable degree, grow like Topsy without regard to 
its environment. It has been possible to abridge Dewey, but not LC. 

LC permits variation in the treatment of specialized topics. Sayers 3 
states that LC was the "first to recognize the necessity for variations 
of treatment as between the different classes, and it is this feature of 
the scheme which has found so much favor in academic libraries." 

COMPLEXITY— The mixed notation of LC is more complex than the 
pure notation of DC. However, Gulledge 4 stated that the LC num- 
bers are on the average shorter than DC. 

INDEXING — LC has no combined index and this is considered a fault 
by users. The relative index of DC has been praised, although the in- 
dex for the 16th edition is inferior to that of the 14th edition. 

BROWSABILITY— DC has the advantage of providing browsability. In 
open stack libraries, this is important. It is practically impossible to 
browse with LC although people try it all the time. 

NOTATION — Dewey's notation is positional; each position represents 
a classification level. LC notation is ordinal. Each class has a number 
of its own not necessarily related to preceding or following classes. 

CLASSES— LC has three times as many classes as DC. Neither classifica- 
tion fits the college curriculum. 

INTERPRETATION OF USE— DC seems to be superior in this because 
there is a Users' Manual for DC. 

SYSTEMS OF SUBDIVISIONS— The system seems to be better in LC, 
but the tables are difficult to use. Students have some difficulty learn- 
ing to build numbers in DC, but once learned, the application is uni- 
form throughout the system. However, if one uses LC cards there 
may be fewer instances when numbers have to be built. 

REVISIONS— Both LC and DC are now being kept current with quar- 
terly corrections. 

Problem areas noted are fiction, translation, literature subdivi- 
sions, political subdivisions, and study and teaching. Both schemes 
have been criticized. 
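
The remark under NOTATION, that each position in a Dewey number represents a classification level, can be made concrete with a small Python sketch. It is offered only as an illustration; the function name dewey_levels is hypothetical, and a three-digit whole-number part is assumed.

```python
def dewey_levels(number):
    """Return the chain of broader classes implied by a Dewey number.

    Because the notation is positional, each added digit narrows the class,
    so the broader classes are found simply by cutting digits off the end.
    Assumes the usual three digits before the decimal point.
    """
    digits = number.replace(".", "")
    levels = []
    for length in range(1, len(digits) + 1):
        prefix = digits[:length]
        if length <= 3:
            # Pad the first three positions with zeros: 6 -> 600, 62 -> 620.
            levels.append(prefix.ljust(3, "0"))
        else:
            levels.append(prefix[:3] + "." + prefix[3:])
    return levels


chain = dewey_levels("621.3")
```

No comparable function can be written for LC from the notation alone. Since LC numbers are ordinal, the relation of one class to another has to be read from the schedules themselves.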

3. Is the choice of LC or DC a function of the size of the collection? 

It seems to be an accepted fact in the literature on classification that 
the LC scheme, because it lacks general numbers for many areas, does 
not serve the small library needing broad classification. 

The ceiling for the 15th edition of Dewey was for libraries of 200,000, 
though this ceiling was lifted for the 16th edition to include li- 
braries of whatever size. However, in a survey conducted by Thelma 
Eaton among college and university libraries, only in libraries of less 
than 200,000 volumes was the value of DC stressed. 

The LC classification is used by 300 university, special, and govern- 
mental libraries in the United States and abroad. The scheme does not 
lend itself easily to abridgment for use in libraries with small collec- 
tions, and serves best in libraries with large collections or special libraries 
which require minute subdivision of limited subjects. 

The Committee recommends Dewey for libraries with general collec- 
tions up to 200,000 volumes in size, and the Library of Congress system 
for those expected to be larger and for those small libraries with spe- 
cialized collections in the social sciences and humanities. 

4. What local or existing factors must be considered in making a choice 
of a classification? 

Various questions were considered: 

if the library is in a state system, what is the rest of the state doing? 

should such factors as the use of fixed location (shelving by size), open 
or closed stacks, availability of a centralized catalog or even adoption 
of some special subject heading system be considered? 

Apparently none of these factors had much to do with the choice of a 
classification system, since nothing could be found in the literature of 
librarianship relating to them. 

5. Is a divisional library vs. a central library a reason for preferring one 
classification or the other? 

In their description of the process of adapting the Dewey Classifica- 
tion for use in a college library, Ashton and Hansen note that "no system 
has been devised with the divisional plan as its basis." On the other hand, 
they do not recommend attempting to develop one. They modified the 
DC system to suit their needs with only 10% of the collection requiring 
reclassification. But their conclusion was that "Dewey, as it now stands, 
confuses the divisional library issue." The LC system, having more 
classes, could be more easily adjusted to the divisional concept. 

6. Which classification, LC or DC, is more satisfactory for centralized cata- 
loging? 

A. Library of Congress characteristics: 

(1) LC cards give class numbers plus LC-style Cutter numbers in the
following proportion (barring law and lesser known languages):
85%.

(2) There are fewer changes in LC class numbers than in DC (see, for
example, literature periods between the 14th and 15th editions
of Dewey).

(3) LC will often serve specialized departments better than DC, and 
since a single system is easier for the whole college, the LC system 
becomes the chosen one where centralized cataloging is done. 

B. Dewey Classification characteristics: 

(1) DC numbers appear on LC cards for about 35% of titles. This 
means that about 65% of cards purchased from LC have no DC 
number. 

(2) DC changes cause confusion when reprinted LC cards are used 
and require constant professional attention. 

(3) DC is too permissive. This is a boon to custom cataloging or to 
local cataloging preferences, but a Pandora's Box in centralized 
cataloging. 

e.g. geography and history combination. 

rearrangement of class to bring related classes together, such
as philology and literature, 400 and 800.

biography in 920 or subject number.

bibliography in 016 or subject number.

extension of class numbers or building numbers beyond what 
is given on the LC card. 

shortening of numbers when class number ends in .01, .08,
etc. Requires common sense: one cannot stop at zero (821.0)
if a general rule for no more than four numbers has been
made.

7. Should the classification numbers on the LC cards be accepted in
preference to making local changes?

The consensus in the literature is that catalogers should accept the 
Library of Congress classification choices in preference to making local 
changes. The reasons for this are: 

(1) This practice makes the cataloging function easier and more
economical.⁶

(2) Few, if any, libraries use all the numbers assigned by LC, but the
majority indicate that they accept 90-99% of the numbers on cards
and proof-sheets.⁷

(3) Dawson actually examined cards in a selected number of libraries 
and found only 84.45% of the numbers were accepted.⁸

Some of the reasons given for NOT accepting the numbers given on
cards are:

(1) Changes in schedules since older cards were printed give obsolete
numbers.

(2) Absence of numbers.

(3) Failure to accept wholly or apply consistently the various revisions
of the classification systems.⁹

8. What are the relative costs of using LC and DC? 

The comments below are based on daily observance of LC cards in a 
catalog department, a limited study of 500 LC cards as received, and a 
study of 500 LC entries in the National Union Catalog, excluding law 
and lesser known foreign languages. 

Classification nos. on cards (excluding law & lesser known languages)

                        Daily cards    500 cards    Nat. Union Cat.
                                                    LC card entries
NO LC class number          10%           16%            15%
NO DC class number          50%           63%            70%


To supply class numbers where these are lacking requires
from 1 to 10 minutes per title by a person familiar with the LC and DC
systems. Obviously, the DC is more costly in this respect, and only the 
advantages derived from its use can counterbalance this cost. It is said 
that fewer DC numbers will appear on LC cards in the future.

To this basic cost, which requires the employment of a professional
cataloger for DC, must be added:

(1) variations due to edition changes on reprinted cards          10%

(2) extension of the DC number when libraries prefer close
classification. Longer numbers are now being given on
more recent cards                                                  5%

(3) assignment of Cutter numbers for every title                 100%

(4) assignment of fiction number (i.e. 813) if libraries do not
use F or have no number for fiction                              100%



The percentages here occurred in the 500 cards. A closer study might be 
useful. 

Considering the LC classification, to the basic cost of supplying 15% 
of the numbers must be added the following: 

(1) supply numbers for all subjects treated from the legal aspect 

10% of the 500 cards 

(2) supply literature numbers for PZ3 100% 

(3) supply numbers for the papers, proceedings, etc., of universities, 
societies, etc., where LC has assigned an A class number (for the 
society) instead of a number for the subject in the paper 

e.g. Riabov's Rules of Motion of Artificial Celestial Bodies.
LC card gives: AS36.U56 no. 1021
               629.1388 (a case of former DC number)

Some colleges want this material with the subject, so an LC num- 
ber must be supplied for it. 

In some cases only an LC A class number is assigned and no 
Dewey number, necessitating a new class number with either classi- 
fication. 

It is obvious that the LC system is less expensive even though some LC 
numbers are lacking and the problems of law, fiction, and series remain. 
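
The basic cost comparison above reduces to simple arithmetic. As a rough sketch in Python (a language chosen here only for illustration), take the 500-card sample, in which 16% of the cards lacked an LC number and 63% lacked a DC number, and apply the 1 to 10 minutes per title quoted earlier. The function name and the dictionary layout are illustrative assumptions, and the added costs listed for each system (Cutter numbers, fiction, law, series) are deliberately left out.

```python
# Shares of cards lacking a class number, read from the 500-card sample.
MISSING_SHARE = {"LC": 0.16, "DC": 0.63}

# The text estimates 1 to 10 minutes to supply one missing number.
MINUTES_PER_TITLE = (1, 10)


def classing_minutes(titles, system):
    """Return (low, high) minutes needed to supply the missing class numbers."""
    missing = round(titles * MISSING_SHARE[system])
    low, high = MINUTES_PER_TITLE
    return missing * low, missing * high


if __name__ == "__main__":
    for system in ("LC", "DC"):
        low, high = classing_minutes(500, system)
        print(f"{system}: {low} to {high} minutes per 500 titles")
```

On these figures the basic cost for DC is roughly four times that for LC, before any of the additional items are counted.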

9. Which classification would be easier to use for a mechanized system? 

Several factors have a bearing on which classification would be most 
easily mechanized. 

STORAGE — Since the classification notation must be converted to Dou- 
ble Digit form, the storage space for each class number would be twice 
its length. 

Storage units necessary would depend on the type of machine. 
For example: 

Decimal machine (IBM 7070 series): 10 digits to the field, allowing
5 letters or numbers per field.
Binary machine (IBM 7090 series): 18 digits to the field, allowing
9 letters or numbers per field.
Few LC numbers run higher than 10 digits (20 in the Double Digit
form), which would take 2 fields in either machine.
For Dewey, if a maximum size were not predetermined, one would 
have to use the longest number in the library as base for determin- 
ing how many fields were needed. 
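
The field counts above can be checked with a short calculation. The Python sketch below is illustrative only: it assumes that every character of the class number, including the period, is stored as two digits, and it uses the per-field digit counts quoted in the text. The function name is hypothetical.

```python
import math

# Digits available per field, as quoted in the text.
DIGITS_PER_FIELD = {"decimal (7070 series)": 10, "binary (7090 series)": 18}


def fields_needed(class_number, digits_per_field):
    """Fields required when each character is stored as two digits."""
    double_digit_length = 2 * len(class_number)
    return math.ceil(double_digit_length / digits_per_field)


if __name__ == "__main__":
    for machine, digits in DIGITS_PER_FIELD.items():
        print(machine, fields_needed("TK7872.M45", digits))
```

For a ten-character LC number such as TK7872.M45 the sketch gives two fields on either machine, in agreement with the statement above.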

INPUT-OUTPUT PROCEDURES in compiler language: WRITE TAPE or PRINT
routines would be very simple for Dewey, but complex for LC or UDC.
The three statements below are written in FORTRAN for a number in
each system:

FORMAT (3H TK, F4.0, 1HM, I2)       LC (TK7872.M45 Masers)

FORMAT (F9.6)                       DC (629.134354 Rockets)

FORMAT (F4.1, 2H+, X, F4.1)         UDC (655.1+ 688.1 Printing and Binding)

Obviously Dewey is easier for the machine to handle on output since
there is less work to be performed internally. The same is true of
input statements, again using FORTRAN as an example:

READ 5, K

5 FORMAT (2A1, F4.0, A1, I2)        LC (TK7872.M45)

5 FORMAT (F9.6)                     DC (629.134353)

5 FORMAT (F4.1, A1, F4.1)           UDC (655.1+ 688.1)

Some of the connectors in UDC could not be used in their present
form because they already have a meaning in FORTRAN (e.g. / =
start a separate line). Again, Dewey is easiest to handle as far as the
machine is concerned.
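
A modern analogue may make the contrast clearer. The following Python sketch is not a translation of the FORTRAN statements; it assumes simple regular-expression parsing and shows that a Dewey number can be read as one decimal field, whereas an LC number must be split into four fields and a UDC compound into parts around its connector. The pattern covers only LC numbers shaped like TK7872.M45, and all function names are hypothetical.

```python
import re
from decimal import Decimal

# LC number: class letters, whole number, then a Cutter (letter + digits).
LC_PATTERN = re.compile(r"^([A-Z]{1,3})(\d+)\.([A-Z])(\d+)$")


def parse_lc(number):
    """Split an LC number such as TK7872.M45 into its four fields."""
    match = LC_PATTERN.match(number)
    if match is None:
        raise ValueError(f"not a simple LC number: {number!r}")
    letters, whole, cutter_letter, cutter_digits = match.groups()
    return letters, int(whole), cutter_letter, int(cutter_digits)


def parse_dc(number):
    """A Dewey number is read as a single decimal field."""
    return Decimal(number)


def parse_udc(number):
    """Split a UDC compound on the '+' connector into decimal parts."""
    return [Decimal(part.strip()) for part in number.split("+")]


if __name__ == "__main__":
    print(parse_lc("TK7872.M45"))
    print(parse_dc("629.134353"))
    print(parse_udc("655.1+ 688.1"))
```

Only the "+" connector is handled here; the other UDC connectors mentioned above would each need their own rule.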
USE OF DC for storage and compiler— If DC were arbitrarily limited to 
15 places after the decimal, it would require 2 fields for storage in a 
binary machine of the 7090 type, or 3 in a decimal machine of the 7070 
series. It would take the format (F18.15) in FORTRAN for both input 
and output. 

The largest LC number would require the same field space in the 
binary machine, but less in the decimal. It would take more machine 
processing in FORTRAN compiler language. 
AN ALL LETTER NOTATION, with a decimal for subdivision, would 
be even better than DC from a classification point of view, and no 
worse from the machine point of view. Such notation does not now 
exist. 

IF A MECHANIZED SHELVING AND FETCHING SYSTEM is de- 
veloped to replace stack men, classification as a shelf location code 
could end. Books could be filed by accession number or some other 
numerical system, or by size, etc. In such a case the dictionary catalog 
might be replaced or supplemented by a classified catalog. 
Classification itself could be more completely developed as an organi- 
zational system if it did not have to serve as a shelf location device. 
Thus, relationships among concepts and structure could be de- 
scribed. Multiple generic relationship classifications could be made 
by computer.¹⁰

REFERENCES 

1. U. S. Library of Congress. Decimal Classification Office. Guide to the Use of Dewey 
Decimal Classification; Based on the Practice of the Decimal Classification Office 
at the Library of Congress. Lake Placid Club, Essex County, N. Y., Forest Press of 
Lake Placid Club Education Foundation, 1962. 






2. British Standards Institution. Guide to the Universal Decimal Classification (UDC). 
London, British Standards Institution, 1963. (B.S. 10000:1963). 

3. Sayers, W. C. B. A Manual of Classification for Librarians and Classifiers. 3d ed. 
London, Grafton, 1955. 

4. Gulledge, J. R. "LC vs. DC for College Libraries." Library Journal, 49:1027. 1924. 

5. Ashton, J. R. and Hansen, O. B. "Adaptations of the Dewey Classification to a 
College Divisional Library." Journal of Cataloging and Classification, 10:86-91. 
April 1954. 

6. Seely, Pauline A. "Dewey 16th edition." Library Resources and Technical Services, 
6:179-183. Spring 1962. 

7. Hoage, A. Annette L. The Library of Congress Classification in the United States. 
Unpublished D.L.S. dissertation, School of Library Science, Columbia University, 
1961. 

8. Dawson, John M. The Acquisition and Cataloging of Research Libraries. Unpub- 
lished Ph.D. dissertation, Graduate Library School, University of Chicago, 1956. 

9. Scott, Edith. "Cooperation and Communication in Cataloging and Classification." 
Southeastern Librarian, 8:136. 1958. 

10. Doyle, Lauren B. "Indexing and Abstracting by Association." American Documentation,
13:378-390. October 1962.



NOMINEES, 1964/65 
Resources and Technical Services Division 
For Vice-president (President-elect): 

Wesley C. Simonton, Library School, University of Minnesota,
Minneapolis, Minn.
Wyllis E. Wright, Williams College Library, Williamstown, Mass. 

For Director-at-large: — three-year term: 

Dale M. Bentz, State University of Iowa Libraries, Iowa City, Ia.
Mrs. Avis G. Zebker, Brooklyn Public Library, Brooklyn, N. Y. 

Acquisitions Section 
For Vice-chairman (Chairman-elect): 

Stephen W. Ford, Grand Valley State College, College Landing,
Allendale, Mich.
Carl Jackson, University of Colorado Libraries, Boulder, Colo. 

For Secretary: — three-year term: 

Marietta Chicorel, University of Washington Library, Seattle, Wash. 
Mrs. Connie R. Dunlap, University of Michigan Library, Ann 
Arbor, Mich. 

For Member-at-large: — three-year term: 

Henry C. Koch, Michigan State University Libraries, East Lansing, 
Mich. 

Robert D. Stevens, Director of Research Collections, East-West Cen- 
ter, Honolulu, Hawaii 

Volume 9, Number 1, Winter 1965



Cataloging and Classification Section 

For Vice-chairman (Chairman-elect): 

Alex Ladenson, Chicago Public Library, Chicago, Ill.

Marian Sanner, Enoch Pratt Free Library, Baltimore, Maryland 

For Member-at-large: — three-year term: 

Margaret C. Brown, Free Library of Philadelphia, Philadelphia, Pa. 
Mrs. Ruth F. Strout, Graduate Library School, University of Chicago,
Chicago, Ill.

For Secretary: — three-year term: 

Richard O. Pautzsch, Brooklyn Public Library, Brooklyn, N. Y. 
Hilda Steinweg, Ohio University Library, Athens, Ohio 

Copying Methods Section 

For Vice-chairman (Chairman-elect):

Joseph Popecki, Catholic University of America, Washington, D. C. 
Stephen Salmon, Washington University Libraries, St. Louis, 
Missouri 

For Secretary: 

Jonathan R. Ashton, School of Library Science, Queens College of
the City University of New York, Flushing, N. Y.
Albert J. Diaz, Microcard Editions, Inc., Washington, D. C.

Serials Section 

For Vice-chairman (Chairman-elect): 

Robert D. Desmond, The Library of Congress, Washington, D. C. 
Thomas D. Gillies, Linda Hall Library, Kansas City, Mo. 

Member-at-large : — three-year term : 

Mrs. Jacqueline W. Felter, The Medical Library Center of New York,
New York, N. Y.
Mrs. Elaine A. Kurtz, U. S. Book Exchange, Inc., Washington, D. C. 






Second International Study Conference 
on Classification Research: Conclusions 
and Recommendations* 

1. Present Situation

The first International Study Conference on Classification for
Information Retrieval was held at Dorking under the auspices of FID in
1957. It was the continuation of the efforts undertaken by the FID over
a number of years, in its CC and CA committees and at its annual
conferences, particularly at the Rome 1951 and Brussels 1955 conferences.

In the seven years since Dorking, much progress has been made, both
in the design of classification¹ systems and in the application of machines
to information retrieval. There were five nations represented at Dorking.
Individuals from 16 countries and two international bodies attended
the present conference. The scope of the second conference is much
broader than the first. This reflects the growing interest and increased
volume of research being carried on in this field.

It is no longer necessary to insist on the role of classification in
information storage and retrieval [...]. The earlier doubts on the
feasibility of machine retrieval have largely disappeared. Moreover, it
has been widely recognized that paradigmatic² organization is an
essential [...] machine [...]. Many theoretical issues have
been clarified, and progress in engineering capabilities for the processing
of large information files has been significant.

Important contributions to classification theory have been made by
various disciplines, such as structural linguistics, semantics, mathematics,
[...]. Experimental testing of existing classification
systems has been pursued on an increasing scale. The individuals at the
present conference reflect this multi-disciplinary approach to the
problems in classification. The purpose of the present conference has been
to consider the situation and point the way toward productive future
work.

* Held at Elsinore, Denmark, Sept. 14-18, 1964; approved Sept. 18, 1964. This paper
was issued at the close of the Conference.

¹ By "classification" is meant any method creating relations, generic or other, between
individual semantic units, regardless of the degree in hierarchy contained in the systems,
and of whether those systems would be applied in connection with traditional or more
or less mechanized methods of document searching.

² See footnote to 2.1 (a) below.

2. New Directions

2.1. Theoretical research

The existing body of theory is in need of further elaboration on
various lines, for both general and special classification systems, as well as
for such established applications as shelf arrangement, card catalogues,
indexes, bibliographies on the one hand and machine systems of various
degrees of mechanization and automation on the other hand.

This theoretical approach embraces among other things:

the study of the mutual interrelationships between thought and 
language, i.e. the connection between concepts, relations between 
concepts, and their expression in the natural language; 
the linguistic study of terminology in scientific and technical fields; 
the construction of controlled vocabularies, thesauri (with or with- 
out hierarchical relations expressed) as well as classificatory pre- 
coordinated structures; 

the study of various methods for embodying "analytic" relations 
given by context (so-called syntactical structures); 
the analysis and evaluation of the functional relationship between
the various components of systems (including classification, codes,
and equipment);

the study of the behavioural processes, e.g. the inductive processes, 
(both at the individual and group levels), which largely determine 
the choice of semantic categories. 

In this connection, a number of specific questions relating to classi- 
fication theory should be investigated further, such as: 

(a) the possible separation of paradigmatic and syntagmatic
relations¹;

(b) the use of universally applicable categories or categories applica- 
ble to several fields; 

(c) the domains of application and conditions for the use of inte- 
grative levels; 

(d) the formal (mathematical and logical) foundations of classification; 

(e) the relevance of a classification system to the subject being classi- 
fied, taking into account related semantic questions from the 
socio-psychological point of view; 

(f) data classification (look-up systems) as contrasted with document 
classification; 

(g) the optimal stage of precision in classification language when 
expressing complexity. 

At the frontier of theoretical research and practical application we 
should investigate: 

(h) symbolization (notation) problems; 

(i) relationship between "general" encyclopedic classification schemes, 
and "specialized" classification schemes. 

¹ These terms being respectively equivalent to: lexical and syntactic structures,
vertical and horizontal axes, etc.

2.2. Applications

Theoretical studies mentioned above should be applied to:

(a) the improvement of existing classification, including work on
methods for construction of thesauri and related tools;

(b) the achievement of better design in new classifications; 

(c) the exploration and implementation of compatibility among 
classification systems and thesauri, including standardized vocab- 
ularies; 

(d) the convertibility of the records of material indexed in one sys- 
tem into another; 

(e) the study of the interaction between the classification systems and 
computer technology in the process of system analysis and pro- 
gramming; and the effects of the cooperation between the classi- 
ficationist and the systems engineer. 

2.3. Evaluation of Classification Systems 

The objective of work in this area should be to obtain generally 
recognized and standardized techniques for evaluation as well as the 
measurement of the dimensions of a classification system. It is necessary 
to devise: 

(a) more adequate experimental and operational test designs, 

(b) better evaluation techniques, 

(c) mathematical models for the more precise and reliable descrip- 
tion of systems, 

(d) better methods for the comparison and evaluation of classifica- 
tion systems. 

Tests and evaluation of existing systems in a variety of disciplines 
should be encouraged. An international cooperative effort should be 
made on collections of sufficiently large size to test the utility of classifica- 
tion systems. There is a need for further work along these lines: 

(e) tests which include the users of classification systems, the
classifiers and indexers, and which uncover divergencies between index
description and the author's own analysis of his paper;

(f) studies of the reliability and consistency of results of classification 
performed by different classifiers within one classification system 
and/or various systems; 

(g) more precise and reliable methods of measuring documentary 
relevance to search queries. 

2.4. Automated Classification 

The problems related to the construction and possible application of 
automated classification could have appeared under all the preceding 
headings, but the high level of interest in this area at this time called for 
special and separate treatment. 

Automated classification includes (1) the mathematical derivation
of classification schedules (the work of the classificationist); and (2)
the automated assignment of documents to categories (the act of
classing, which is the work of the classifier), regardless of whether the
categories were automatically derived or were chosen from a classification
scheme previously devised.

The automated assignment of documents using a prepared classification
schedule is already operationally feasible. The other kind of
automated classification has been subject to some experimentation. Further
experimentation on an adequate statistical basis is desirable in order to 
determine the scope and limitations of these procedures in comparison 
with other classification methods. Statistically reliable studies are needed 
to determine how the automated classification compares with the vocab- 
ulary and the document distribution applied by classifiers and users. 
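
To make the second kind of automated classification concrete, here is a minimal Python sketch of assigning documents to categories from a prepared schedule. The Conference paper does not describe a method; the term-overlap rule, the two sample categories, and their terms are assumptions made purely for illustration.

```python
# Hypothetical prepared schedule: category -> indicative terms.
SCHEDULE = {
    "classification theory": {"classification", "notation", "hierarchy"},
    "machine retrieval": {"computer", "retrieval", "machine"},
}


def assign(document, schedule=SCHEDULE):
    """Return the category whose terms overlap most with the document's
    words, or None when no term matches."""
    words = set(document.lower().split())
    best_category, best_score = None, 0
    for category, terms in schedule.items():
        score = len(words & terms)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


if __name__ == "__main__":
    print(assign("Notation and hierarchy in classification"))
    print(assign("Machine retrieval by computer"))
```

A rule this crude ignores punctuation, word forms, and ties between categories, which is one reason the statistically reliable comparisons called for above would be needed before relying on such a procedure.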

3. Ways and Means 

3.0 Steps have been taken to systematize terminology on a national level, 
in order to standardise and simplify the vocabularies used for
describing classification systems. It is recognized that standardisation
of terms requires prior coordination of concepts by the different 
specialists concerned. It is urged that this effort be organized by an 
international body such as FID/CR or ISO. 

3.1 There is already some cooperation among research teams on an infor- 
mal basis. This should be extended and placed on a more organized 
footing, particularly through international exchange of research per- 
sonnel, exchange of data and computer programs, fellowships, etc. 

3.2 A register of classification research projects in progress should be
maintained by some suitable organization. A frequent and prompt 
publication containing information on new activities is needed. 
Clearing-house arrangements should be made for the collection and 
provision of information on classification systems and thesauri for 
special fields, and materials available in machine code for the use 
in classification research. Critical reviews of research progress should 
be encouraged. 

3.3 Shortage of personnel for classification research is a serious obstacle 
to progress. Measures to improve this situation by training are being 
made and should be encouraged in institutions of higher learning. 
Attempts should also be made to interest research workers from cer- 
tain other disciplines in classification research problems. 

3.4 In accordance with the FID Bureau recommendation of June, 1964, 
the formation of national groups for the study of classification and 
retrieval languages should be encouraged where such groups do not 
exist. The FID/CR Committee is to be considered as the link be- 
tween these various groups. 

3.5 More financial support will be required for classification research 
and its supporting organizations. The responsibility for financing 
research projects is primarily in the hands of national organizations
for scientific research, foundations, and international governmental
organizations, like Unesco, and non-governmental organizations
like FID itself. Eventual inclusion of this list of priorities for
fundamental research on classification problems in the FID Long
Term Programme could stimulate additional support and encourage
further cooperation among the various organizations concerned.

3.6 Further symposia or study groups should be organized at reasonably 
short intervals. They should be planned in such a way as to facilitate 
detailed study and critical review of work in progress. 



Variant Pricing of Serial Publications 



Peter Gellatly, Serials Librarian 
University of Washington Library, Seattle 

THE PROBLEM of the variant pricing of serial publications is a 
complex one. Fixed pricing is by no means the rule and seldom oc- 
curs, except in the case of such large, mass-distributed publications as 
Life, Time, the Saturday Evening Post, and so on. But even here one can- 
not plump too solidly in his conclusions. The rates are seldom as rigidly 
fixed as might appear at first glance. They are, in fact, almost always 
lower in the first year of subscription than at any other time, simply be- 
cause it is harder for a publisher to acquire a new subscriber than to re- 
tain an old one. In the following years also, the rates can vary, and often 
do. Serial prices are anything but stable, and no easy assumptions can be 
made about them. 

The way in which a subscription is obtained matters. For the private 
individual, buying a subscription at his door or over the telephone is 
the least economical way to do it (few subscriptions are available in this 
way, despite the omnipresence of the salesmen); and, for that matter, 
publications bought from salesmen cost about as much as they do on the 
newsstands, the large and resounding offers of the salesmen notwithstanding.
It is a much better idea for the private individual to go directly to
the publisher for his subscriptions. Libraries, of course, usually make 
their purchases through jobbers for the convenience this arrangement 
affords, and while occasionally they receive from the jobbers the favor- 
able long-term subscription rates that are available to subscribers in 
general, this is not always the case. They have to be content with what 
they get; and whether or not a reduction is available depends upon the 
jobbers themselves. The fact that many libraries have budgetary
prohibitions against advance expenditures also influences the rates they pay.
Most serials are paid for in advance; and unless it is possible to go two or
three years into the future, there is little hope of securing the
multiple-year advantage. However, in cases in which orders are sent out in the
"until-forbidden" way, jobbers invariably make long-term purchases,
and even occasionally pass on some of the savings they obtain in this way
to the libraries concerned. In such cases, a pro-rated adjustment is
required when premature cancellation occurs; but even so, the
arrangement is usually of benefit to the library.

* Paper prepared on request of the RTSD Acquisitions Policy and Research
Committee.

The rates vary for another reason. Publishers, or at least their circula- 
tion managers, are increasingly anxious to sell their publications, and 
one device they use for doing this is to offer bonuses of various sorts to 
their subscribers. Bonuses are typically offered to new subscribers as an 
enticement to subscribe, but sometimes also to old subscribers at renewal 
time, on the theory evidently that some compensation has to be offered 
them in view of the increased rate charged after the first year, and cer- 
tainly in view of the lower rate granted new subscribers. The bonuses 
for new subscribers are, of course, in the way of price concessions. This is 
rarely the case where renewals are concerned, although multiple-year 
rates are offered which become progressively more favorable as the sub- 
scription period lengthens. What is more usual is for a publisher to offer 
a few extra issues for a prompt renewal (with cash enclosed as a means of 
reducing bookkeeping costs to the publisher), and less frequently, a small 
book, often but not always made up of writings gleaned from past issues 
of the publication in question. The Harvard Business Review and Chang- 
ing Times, among others, issue books that are used in this way. 

In the matter of renewing subscriptions, it is of interest to note that 
bargaining exists — not in the common sense of haggling, but in a real 
sense nevertheless. This is what happens. The longer a subscriber waits 
before he makes his renewal, the lower the price becomes. With each re- 
newal notice, the publisher makes a concession until at last, in a splurge 
of what must be tic-making generosity, he offers a rate that very nearly 
approximates the first-year rate. Not all publishers are amenable to the 
threat of a discontinued subscription, as not all subscribers are to the 
blandishments furnished by the publishers for their continued loyalty; 
but certainly bargaining is possible to a point in dealing with the publish- 
ers of the large, commercial periodicals. While such experimenting in the 
open market is fine for the individual, no library would think of indulg- 
ing in it — not presumably because it lacks dignity, but because it subjects 
the library to the danger of having its subscriptions interrupted or even 
cut off. Most publishers are almost paternally indulgent these days, and 
one must wait a long time before this extremity is reached; but reached 
it finally is. Losing a subscription through default is annoying for anyone, 
but a minor disaster for a library. 

Price variants assume many forms. One with which we are all familiar, 
sometimes happily and sometimes not, is the service-basis pricing used, 
among others, by the H. W. Wilson Company, according to which a
library pays for Wilson publications at rates depending upon its income
and the size of its collection. This idea is socially useful in that it allows 
every library the possibility of purchasing Wilson publications (many, of 
course, quite indispensable), regardless of its size and solvency, and is 
certainly commercially astute in that it ensures a large sale for these pub- 
lications. Its disadvantages have been pointed out often enough, but the 
only one that seems to matter much nowadays is the excessive amount of 
paperwork that subscribers find themselves involved in from time to time 
in order, among other things, to keep the rates up to date. 

To say that librarians are addicted to the pay-what-you-can-afford 
principle is perhaps saying too much, but this principle appears fairly 
frequently in the pricing of library publications. As the cost of profes- 
sional memberships varies with the applicant's salary, so the price of 
some library publications varies. The latest example of this to come to 
notice concerns the new Canadian Library Association publication, 
Canadian Library Horizons, which is available to individual subscribers 
at $9 a year, to libraries with incomes of less than $100,001 at $10, and to 
libraries with greater incomes at $25. There are many variations on this 
particular theme. In the case of the new Economics Library Selections 
List, published by the University of Pittsburgh, the rate for students is 
$1 a year, for professors $2, for university and public libraries and for un- 
affiliated individuals $10, and for all others $15. This elaborate price 
schedule, while it was not produced by librarians, is typical in that it
illustrates the way in which price schedules are arranged to secure an
advantage for special groups.
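
A price schedule of this kind amounts to a small lookup rule. The Python sketch below encodes only the Canadian Library Horizons rates quoted above; the function and parameter names are hypothetical.

```python
def canadian_library_horizons_rate(subscriber, library_income=None):
    """Annual rate in dollars under the schedule quoted in the text."""
    if subscriber == "individual":
        return 9
    if subscriber == "library":
        if library_income is None:
            raise ValueError("library_income is required for libraries")
        # Libraries with incomes of less than $100,001 pay the lower rate.
        return 10 if library_income < 100_001 else 25
    raise ValueError(f"unknown subscriber type: {subscriber!r}")


if __name__ == "__main__":
    print(canadian_library_horizons_rate("individual"))
    print(canadian_library_horizons_rate("library", library_income=80_000))
    print(canadian_library_horizons_rate("library", library_income=250_000))
```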

Instances occur in which libraries are charged less for their sub- 
scriptions than are individuals; and certainly the opposite case is not 
unknown. Libraries are considered fair game by some publishers, who 
feel no doubt that selling a subscription to a library will decrease their 
private sales. This assumption is hard to disprove, but equally hard to 
prove. It is not at all certain that a person who uses a publication in his 
local library would continue using it if he were required to take out a 
subscription himself; but of course he might. Still the anti-library (or 
better, perhaps, the pro-individual-subscriber) bias persists. For all this, 
many publishers now seem to be coming to the point of making a special 
effort to woo libraries by giving them favorable rates rather than using 
discriminatory rates against them. This is particularly so in the case of
the publishers of learned journals who are aware that a large part of 
their subscription revenue comes from libraries. Firms that issue mass- 
directed publications, on the other hand, are less concerned with the in- 
come obtained from their library subscribers and so are less inclined to 
give them special consideration. 

One notable example in which an effort has been made by a publisher 
to attract library subscribers is seen in the dealings of the Pergamon Press. 
Subscriptions are supplied by this firm at what it calls its "A" and "B" 
rates. The "A" rates, which are considerably lower than the "B", apply 
to libraries and to various other sorts of institutions, while the "B" rates 
are assessed against individual subscribers (including, one assumes, 
industrial firms and manufacturers). Individual subscribers are further 
discriminated against in that they are required by Pergamon to certify 
(whatever that means) that they will use their subscriptions for their 
own personal research and neither lend nor sell them. Pergamon has not 
always been so magnanimous in its treatment of libraries (nor, for that 
matter, so indifferent to the needs of the individual), but evidently the 
realization has been borne in upon it that as libraries are its best cus- 
tomers, some concessions ought to be made to them. 

Libraries are often the recipients of less generous treatment (although 
it must be pointed out that Pergamon's prices are high, even in the "A" 
category), and are charged considerably more for their subscriptions than is 
the individual subscriber. The feeling here, if it is not hostile to libraries, 
is scarcely sympathetic. Apparently it is thought that libraries will pay 
any amount for a good publication, whereas individuals are limited in 
the amount they can afford to pay and should be shown some preference 
in the rates they are charged. A case in point is Iron Age, a publication 
sold to individuals at $2, but for which libraries must pay $25. Other 
publications are likewise more expensive to libraries than to individual 
subscribers: Arithmetic Teacher, Automotive Industries, Journal of 
Heterocyclic Chemistry, Mathematics Teacher, Progressive Architecture. 

Preferential treatment is accorded in still other ways. Members of 
learned and professional organizations, for instance, often receive the 
publications of their organization as a membership privilege. In this re- 
gard, one need only mention the ALA and the generosity with which it 
sends out its publications. Such an arrangement, while still very common, 
is becoming less so as the cost of publication increases. (It appears that 
the cost of publication is mounting at a considerably greater rate than is 
the general cost of living, but this is beside the point.) A more usual ar- 
rangement is for an organization to make its publications available to its 
members at a reduced rate, generally from thirty to fifty per cent of the 
cost of these publications to the ordinary subscriber. Examples abound, 
but a few will serve to give an idea of how concessions are made in favor 
of member-subscribers. Speculum is sold to members of the Mediaeval 
Academy of America at a reduction of twenty per cent. Scandinavian 
learned societies offer their publications to members at a standard and 
invariable thirty per cent off list-price. Finally, the American Institute of 
Aeronautics and Astronautics makes its many publications available to 
its members at a fifty per cent reduction. 

As for library memberships, there are a number of things to be said. 
First, a library may become a member of a learned society or other organ- 
ization willy-nilly simply by subscribing to one of its publications. This 
is a frequent enough happening; and any library with a reasonable col- 
lection of serials finds itself also in possession of a number of inadvertent 
memberships, most of which bring it nothing but the desired publication. 
There are others, however, that bring all sorts of secondary publications, 
some perhaps wanted and some perhaps not. Second, it is occasionally 
necessary for a library to take out a membership before it is put on a 
society's mailing list. This is usually no hardship, although there may be 
a charge both for the membership and for the publications. In some such 
cases, the favorable subscription rate for members more than offsets the 
cost of the membership, but this is not always so. It might well be that a 
large technical library would find paying the $500-a-year membership fee 
of the American Institute of Aeronautics and Astronautics an advantage, 
since the number of publications put out by this organization each year 
and made available to members at half-price is enormous. On the other 
hand, many libraries would find so expensive a membership economically 
unfeasible. The advantages and disadvantages of the membership ar- 
rangement have to be considered in each case separately. Third, member- 
ships are not always available to libraries but sometimes to individuals 
only. This creates no difficulty for the library if the needed publications 
can be obtained without the necessity of membership; and certainly no 
library can complain if it gets the publications it is after without ex- 
cessive cost and bother. It is true, moreover, that society publications are 
almost always made readily available to libraries, either as gifts or pur- 
chases. Fourth, in cases in which personal membership is mandatory, 
there is sometimes difficulty in finding a librarian with appropriate qual- 
ifications, but generally in such cases the society is willing enough to ease 
its regulations in order to oblige the library and ensure, incidentally, that its 
publications are given a place in the library's collection. 

How does a library find its way about in a situation in which an in- 
creasing number of its serial publications have more than one price? The 
question remains unanswered. There is nothing predictable, or even en- 
tirely rational, in the pricing of serial publications; and what is paid for 
them depends upon a number of considerations, not the least of which 
is the publisher's state of mind at the time the purchase is made. One 
thing more: despite the almost frenzied efforts of publishers to sell their 
publications, real bargains are as rare now as they have always been. 
The hard sell is here to stay, and it should be recognized for what it is. 



EDITOR RECOMMENDS: 
"Push Button Bibliography Today and Tomorrow." Bulletin of Bibliography. 
24: 73-78, 86-88, May-August 1964. Kenneth Shaffer (Director, Graduate School 
of Library Science, Simmons College, Boston), Ralph Parker (Librarian, University 
of Missouri, Columbia), and Ludwig Sickmann (teacher at the Bibliothekar-Lehrinstitut, 
Cologne) discuss the realistic and false hopes of automation in 
bibliographic retrieval. 



The Advantages and Disadvantages of a 
Classified Periodicals Collection* 



Joseph C. Borden, Associate Librarian 
University of Arkansas Libraries, Fayetteville 

ON THE QUESTION, "What are the advantages and disadvantages 
of a classified periodicals collection compared with an unclassi- 
fied collection," it will be desirable at the outset precisely to define or de- 
scribe a classified periodicals collection and an unclassified periodicals col- 
lection. The periodicals belonging to a library need to be shelved in such 
a way that they can easily be found when needed. This is true whether 
the periodicals are bound or unbound. If the periodicals are not classified 
in the library by some scheme of library classification such as the Dewey 
decimal classification or the Library of Congress classification, then they 
are usually kept in alphabetical order by title. In either case, all issues of 
the same title are kept together, supposedly in chronological order, ex- 
cepting possibly the current issues. 

Classification of the periodicals may be defined as the placing of 
numbers or letters (known as call numbers, class numbers, or class marks) 
on the title to achieve its being in the library with other publications on 
the same subject and near to other publications on related subjects and 
in some logical numerical or alphabetical relationship to other 
publications in the same or related subjects. Classified periodicals in a library 
collection are usually shelved among or alongside the classified books in the 
same or related classifications. An unclassified collection of periodicals 
may be defined as the arrangement of periodicals in a library by some 
other method than by classification; and at the present time the only 
other practical method in use is the alphabetical method of arrangement. 

Large general collections of periodicals, such as in university 
libraries and large reference libraries, usually hold to the classified 
arrangement, whether the stacks are open or closed; while some smaller 
collections, such as in special libraries, college libraries, and public libraries, 
hold to the unclassified or alphabetical arrangement of their periodicals. 
For example, Gloria Whetstone, in a recent survey of serial practices in 
16 selected large university libraries, found that 12 of the 16 university 
libraries classified all their periodicals, two more classified some of their 
periodicals, and the other two, namely Wayne State and Rutgers, did not 
classify any. 1 On the other hand, in special libraries, where most of the 
periodicals are in a special and restricted subject field, the classifying 
generally would not serve a useful purpose. 

* Article based on a paper delivered before the Arkansas Resources and Technical 
Services Group Meeting, Little Rock, October 28, 1963. 

Since the individual library will be already committed to one method 
or the other, it may be asked who is interested in the comparative merits 
of the two means of shelving periodicals. Aside from the theoretical in- 
terest of the subject, it is a matter which some librarians have to face in 
actuality from time to time: when a new library is established, decision 
must be made about classifying or not classifying the periodicals; when 
a going library experiences difficulties with its periodicals, it may wish to 
consider changing to other means of shelf arrangement; when a library 
is to occupy a new building, it may want to review its way of shelving 
periodicals; when a faculty member or student in an academic library 
asks why classify or why not classify, it is useful to know the answers. 
For all of these reasons it is worthwhile giving some thought to the 
question. 

Advantages of Alphabetical Arrangement 

Processing is less costly for the unclassified or alphabetical arrange- 
ment of periodicals. This refers not only to the savings made by not 
having to classify each title in the first place, but also may carry through 
to such details as the savings from not needing to mark the call number 
on each bound volume. From the use standpoint, the finding of the title 
is more direct, since it is not necessary to find the call number first. In an 
open shelf arrangement it follows from this that the user of the library 
can find what he needs more quickly, or at least more directly. 

Keeping the periodicals in one part of the building by themselves in- 
stead of scattering them with the books makes it easier for persons in- 
terested only in periodical references since all of the periodicals are to- 
gether. This also simplifies matters for the technical staff working with 
periodicals. For the same reason, if the library enters a period of ex- 
panded acquisitions in the field of back files of periodicals, when more 
shelf space for a particular title or a particular section of titles is needed 
than originally planned, the necessary shifting is less cumbersome than 
if the books also should need to be displaced and moved. 

Checking holdings for ordering from offers of periodicals from book- 
sellers is also simplified, as a single alphabetical shelf-list should be easier 
to consult than a classified shelf-list. 

Advantages of a Classified Arrangement 

In a classified arrangement, changes in a periodical title permit shelv- 
ing the whole file of the periodical together despite the title change. 
Similarly, the various bulletins, transactions, and proceedings of the same 
organization can be easily kept alongside one another. Users interested in 
a particular subject or group of related subjects will find not only all the 
periodicals in the subject, but also the books in that subject, relatively 
close to one another. 

The classification of the periodicals usually scatters periodicals of 
similar titles and thus tends to prevent confusion in shelving and finding; 
e.g., such titles as begin with the words Journal of the .... In addition, 
difficulties in alphabetizing caused by such small parts of speech in titles 
as prepositions and articles can be eliminated by classifying. If, later, the 
collection is to be broken up into departments, a classified arrangement 
will make the change easier than would a non-classified arrangement. 
Non-indexed periodicals in special fields are virtually "lost" to the 
scholar unless they are classified by subject. 

Looking over the respective advantages of the two types of arrange- 
ment, one realizes that the chief advantage to the library user of the un- 
classified shelving of periodicals is the direct approach made possible by 
the simple alphabetical arrangement and that the chief advantage to the 
library user of the classified arrangement of periodicals is the grouping 
together of periodicals dealing with the same or similar subjects. The 
respective disadvantages of the two types of arrangement are the obverses 
of their advantages. It is perhaps worthwhile to take the time to list the 
disadvantages, also, with comments. 

Disadvantages of Alphabetical Arrangement 

When a periodical changes its title, the volumes under the new title 
will be separated from the volumes of the old title if the alphabetical ar- 
rangement is adhered to. This disadvantage is not difficult to overcome 
partially, by the use of dummies on the shelves, but there is a certain in- 
convenience to it. 

Titles beginning bulletin, proceedings, transactions, etc., of the same 
organization will be shelved with the bulletins, proceedings, transactions, 
etc., of other organizations if shelving is strictly by title. Such shelving 
separates, for example, the proceedings and transactions of the same 
organization, which for the user or the library will sometimes be 
inconvenient. This difficulty can be overcome in many cases by shelving 
such files under the name of the corporate body issuing them. 
(Parenthetically, it should be added that some library users, such as research 
scholars, especially scientists who are journal-oriented, would prefer to 
have proceedings and transactions kept under their titles rather than 
moved to entry under name of the organization, this for the reason that 
scientific indexes and abstracting services enter under "Bull.", "Proc.", 
"Trans.", etc.) 

The periodicals will not be grouped by subjects, which makes them 
less convenient for the use of scholars interested only in particular sub- 
jects. In many cases this will not be a real difficulty. For example, in li- 
braries with closed stacks, most persons needing the periodicals will not 
care whether the arrangement is by subject or not; and in most cases the 
users of periodicals in libraries are found to want a particular citation 
or several particular citations in a given title or in given titles and are 
consequently interested only in finding these citations and not in 
browsing among many titles on the same subject. 

Titles beginning Journal of ... and certain other titles which begin 
similarly will be together in one section of the periodical collection. This 
is not a serious difficulty, except perhaps to the casual user, and can be 
reduced by careful attention to the alphabetical rules under which 
unclassified periodicals are shelved. The same comment may be made on 
the disadvantage of long titles containing prepositions and articles. 

If the collection is to be later transformed into a departmentalized 
collection by broad areas, such as humanities, social sciences, and phys- 
ical sciences, the transformation will be more difficult where the titles 
have not been previously classified. However, this is merely an incon- 
venience to the staff at the time the change is made and should not be a 
strong argument against leaving the periodicals unclassified. 

The availability and contents of a periodical not indexed in one of the 
indexing services tend to be more or less overlooked in an unclassified 
collection. 2 

Disadvantages of Classified Arrangement 

The processing of periodicals in a classified arrangement will be more 
costly. This is not really a disadvantage if it works to the well-being of the 
library. For example, it might be said that binding periodicals is more 
costly than keeping them unbound, but the advantages of binding for 
reference purposes are so apparent that the cost is absorbed as a matter 
of course. The same thing happens with the cost of classifying in libraries 
which classify. It is considered a necessary expense, fully justified by the 
advantages the library believes it receives from the classification of its 
periodicals. 

The user of the library, or the staff member, must find the call num- 
ber before he will be able to obtain the desired periodical. Thus it will 
take longer for him to get to the citation than in a library where the 
shelving is strictly by title. This is generally true of periodicals with sim- 
ple titles, but will not necessarily be true of more complicated titles, where 
the alphabetical arrangement, theoretically simple, may prove to be more 
difficult for the locating of given titles than the classified arrangement 
with its call numbers. It should also be remembered that in an open-shelf 
library with a classified periodicals collection the library staff members 
and the constant users will know the classification scheme sufficiently well 
after a time to be able to go directly to the shelves without the need to 
consult the catalog or other listings first. 

Persons concerned only with periodical references will find the peri- 
odicals scattered among the books of the library when a classified arrange- 
ment is utilized. The disadvantage here is really a matter of inconven- 
ience. Even when the periodicals are arranged alphabetically, there is a 
certain amount of inconvenience in passing from one part of the 
alphabet to another. And where the periodicals are classified, there is a clear 
gain of convenience for persons working largely in one field of knowledge. 

The disadvantage to a classified collection when shifting time comes 
is that books as well as periodicals will need to be shifted. This disadvan- 
tage, fortunately, is experienced relatively infrequently. 

The checking of holdings of periodicals for ordering purposes is 
more cumbersome when the periodicals are classified than when they are 
shelved alphabetically. This disadvantage is absorbed entirely by the staff 
and does not trouble the user of the library. It is in the same category of 
disadvantages as the cost aspect of classification. 

It is the specific purposes and objectives of the individual library 
which should determine whether or not the library should classify its 
periodicals or classify some and not classify others. For instance, who will 
use the library — the general public, a general faculty and student body, 
selected advanced scholars, or subject specialists? How is the library 
used — open shelves so that the public is getting its periodicals from 
the shelves directly, or closed shelves so that staff only has access to the 
shelves? What is the arrangement of the building — is it practical for the 
periodicals to be in among the books, or do shelf considerations make the 
separation of the periodicals advisable? 

The average librarian today does not encounter the problem of 
whether or not to classify periodicals. The decision will have been made, 
probably years before the current librarian came on the job. Yet the fact 
that there are two general ways to arrange periodicals and the fact that 
both ways have their adherents should lead the librarian at least to think 
whether his periodicals will be better utilized one way or the other. Some 
libraries have made the decision to change from one system to the other 3 
and have found the change worthwhile, either in reducing staff time in 
servicing periodicals or in improving their availability. It behooves the 
librarian to analyze the purposes and objectives of his collection and then 
to decide whether better to hold with what he has or to adopt and change 
over to the other system. 



DO SCIENTISTS 
NEED A 
COMPREHENSIVE 
INDEX TO 
ALL OF SCIENCE? 

When we asked scientists this 
question recently, most of them 
replied yes — a few said no. If 
you are among the few who are 
perfectly satisfied with the "selec- 
tive" approach of conventional 
indexing systems, stop right here. 
However, if you, along with the 
majority, feel that improved meth- 
ods and comprehensiveness are 
needed in information retrieval and 
dissemination, read on. 

Every research scientist has, on 
one occasion or another, been 
stumped by the complex problems 
of searching the scientific litera- 
ture. Today's search requirements 
are hobbled by the limitations of 
traditionally organized indexing 
systems. 

At the Institute for Scientific Infor- 
mation, we have tested and devel- 
oped a new dimension in indexing. 
To this we have added compre- 
hensive coverage, resulting in a 
unique tool for information dis- 
covery. We call it Science Citation 
Index. 

The Science Citation Index instantly 
identifies the most recent publi- 
cations referring to a particular 
work since its publication. The 
Science Citation Index provides the 
scientist with an improved, novel 
starting point: the specific paper or 
book of a specific author . . . and 
from there the searcher is brought 
forward in time to current papers 
relating to the earlier work. 

Revolutionary? Invaluable? We 
think so and we think you will too 
after you've had an opportunity to 
review descriptive material on this 
new technique. Write us now and 
we'll send details by return mail 
— without obligation, of course. 

Dept. 23-2 

INSTITUTE FOR SCIENTIFIC INFORMATION 

325 Chestnut Street Philadelphia Pa. 19106 



The H. W. Wilson Company takes pleasure in announcing the 
first issue of Biological & Agricultural Index, the new suc- 
cessor to Agricultural Index. A detailed subject index to 146 
periodicals published in the United States, Canada, and the 
British Commonwealth, Biological & Agricultural Index be- 
gins publication this month and is expected to be of great 
value to libraries that need ready reference to magazines in 
the fields of: 

agricultural chemicals        forestry & conservation 
agricultural economics        genetics 
agricultural engineering      horticulture 
agricultural research         microbiology 
agriculture                   mycology 
animal husbandry              nutrition 
biology                       physiology 
botany                        plant science 
dairying & dairy products     poultry 
ecology                       soil science 
entomology                    veterinary medicine 
feeds                         zoology 

Periodicals were selected for indexing in Biological & 
Agricultural Index by the subscribers to Agricultural Index 
and include 78 publications oriented toward biology and 68 
oriented toward agriculture. The form of indexing is similar 
to that used in most of the other Wilson indexes, with subject 
headings based on those used in the dictionary catalog of the 
Library of Congress, and numerous subheadings and cross 
references to facilitate quick reference. Entries include the 
title of the article, author, periodical, volume, inclusive pag- 
ing, and date; bibliographies, illustrations, tables, graphs, and 
diagrams are also noted. 

Biological & Agricultural Index will be published monthly 
except in September, with bound annual cumulations, and is 
available by annual subscription on the H. W. Wilson service 
basis. Each subscriber will be charged only for the indexing 
of those periodicals received by his library. For a quotation of 
your service basis rate, write today. 

THE H. W. WILSON COMPANY 
950 UNIVERSITY AVENUE, BRONX, NEW YORK 10452 



Presenting 
a Master Key to 
the greatest treasure house 
of applied scientific 
knowledge 
in the world. 



[...] important development in information retrieval 
... as beneficial to the reference librarian as 
the Cumulative Book Index. The International 
Index of Patents is the shortest route to 
[...] of the information, engineering and scientific 
technology to be found in patent literature. 

Developed by the Interdex Corporation with the 
cooperation of the U.S. Patent Office, and 
distributed through Bro-Dart Books, Inc., these 
directories are fully cross-referenced to enable you 
to answer any question. 

Material is indexed by Date ... by Patent Number 
... by Subject ... by Class and Subclass ... by 
Standard Industrial Classification. There is even a 
list of those patents, in each field of interest, 
issued before the inception of the numbering 
system in 1836. 

Subscriptions are being accepted now for the first 
six volumes, covering every United States 
Chemical Patent issued from 1790 through 1960. 
Three additional units are in process, covering 
Foreign Chemical Patents and U.S. and Foreign 
Electrical Patents, as part of a program to publish 
the complete body of patents covering all fields 
of endeavor. 



For further information on the International Index of Patents please write to: 

BRO-DART BOOKS, Inc. 

Dept. HOIA • 1609 Memorial Ave. • Williamsport, Pennsylvania 



For Libraries That Want Quality 
Bookbinding 



GLICK BOOKBINDING CORP. 




Specialists in the Binding and Rebinding 
of Books and Periodicals 



Serving Institutional, Public 
And Research Libraries 
Since 1905 



We Have Moved — Our New Address Is 
32-15 37th Avenue 
Long Island City 1, New York 

STillwell 4-5300 

In Nassau and Suffolk In New Jersey 

IVanhoe 3-9534 Mitchell 2-5374 



